Chinese tech giant Alibaba has unveiled two open-source AI models, Qwen-VL and Qwen-VL-Chat. The models excel at visual recognition and can hold more intricate conversations than their predecessors. By combining visual recognition with conversational AI, Alibaba has produced a tool that supports location-specific inquiries. Qwen-VL is a pre-trained large vision language model that can comprehend images, text, and bounding boxes in prompts and supports multi-round question answering in both English and Chinese. It is based on Alibaba Cloud’s 7 billion parameter model, Qwen-7B. Qwen-VL is the first generalist model to support grounding in Chinese, detecting bounding boxes from open-domain language expressions in both Chinese and English. Qwen-VL-Chat, by contrast, is tailored for conducting intricate conversations and delivering location-specific information to users. The models are poised to enhance the user experience by offering more precise and pertinent information.
Why does it matter?
The use of AI models with enhanced visual localisation capabilities, such as the ones launched by Alibaba, poses challenges, mainly for data protection and privacy.
Data protection and privacy: The use of AI models with visual localisation features often involves gathering and analysing substantial datasets. This raises concerns because the data may include sensitive information, such as images of individuals and their whereabouts, potentially enabling identification and surveillance. It is therefore imperative that these datasets are rigorously protected against unauthorised access, use, or disclosure. Stringent data protection regulations such as the GDPR and the California Consumer Privacy Act (CCPA) grant individuals enhanced privacy rights over their personal data.