King Abdullah University of Science and Technology
We focus on Trustworthy Machine Learning and AI for Science. In particular, we are interested in giving Machine Learning and Generative AI models controllable and editable memorization of data, knowledge, and concepts, and in applying this to AI safety problems such as hallucination and copyright infringement:
Differential Privacy is a method used to protect people's private information when analyzing large sets of data. It ensures that no individual's personal data can be identified, even when data is shared or used for research. The main idea is to make sure that whether your data is included or not, the results of the data analysis won't be noticeably different. This way, your privacy is protected because your individual information is hidden within the crowd.
Research Topics: private stochastic optimization, private statistical estimation, DP-SGD, DP pre-training and fine-tuning of large models, privacy amplification, and privacy auditing. We are also interested in privacy leakage attacks such as inference and reconstruction attacks.
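To make the "results won't be noticeably different" idea concrete, here is a minimal sketch of the classical Laplace mechanism for a counting query. The function name private_count and the epsilon values are illustrative assumptions, not part of any specific library.

```python
import numpy as np

def private_count(data, predicate, epsilon=1.0):
    """Release the number of records satisfying `predicate` with epsilon-DP.

    A counting query has sensitivity 1: adding or removing one person's record
    changes the true count by at most 1, so Laplace noise of scale 1/epsilon suffices.
    """
    true_count = sum(predicate(x) for x in data)
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Example: how many records exceed a threshold, released privately.
records = [3, 7, 1, 9, 4, 8]
print(private_count(records, lambda x: x > 5, epsilon=0.5))
```

Smaller epsilon means more noise and stronger privacy; the released count is close to, but deliberately not exactly, the true count.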
Imagine teaching a computer to recognize different animals using pictures. Over time, the computer gets better at identifying animals by learning from many images. However, if one of those images needs to be removed for privacy reasons, Machine Unlearning helps the computer forget what it learned from that specific image without having to start learning from scratch with all the other images.
The main goal of Machine Unlearning is to ensure that data can be removed effectively and that the influence of this data is erased from the model. This is crucial for scenarios where individuals request the deletion of their personal data, as required by laws like the General Data Protection Regulation (GDPR).
Research Topics: large model unlearning, unlearning for different types of data such as graphs, federated unlearning, unlearning evaluation, theory of machine unlearning.
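To make the "forget one image without starting over" idea concrete, here is a minimal sketch of shard-based (SISA-style) exact unlearning: the training set is split across several small models, so deleting a point only requires retraining the one shard that contained it. The class name, the shard count, and the assumption that every shard contains both classes are illustrative choices, not a reference implementation.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

class ShardedModel:
    """Illustrative SISA-style ensemble: one model per data shard."""

    def __init__(self, n_shards=4):
        self.n_shards = n_shards
        self.shards = [[] for _ in range(n_shards)]   # (x, y) pairs kept per shard
        self.models = [None] * n_shards

    def fit(self, X, y):
        # Assign training points to shards round-robin, then train each shard.
        for i, (xi, yi) in enumerate(zip(X, y)):
            self.shards[i % self.n_shards].append((xi, yi))
        for s in range(self.n_shards):
            self._train_shard(s)

    def _train_shard(self, s):
        Xs = np.array([x for x, _ in self.shards[s]])   # rows are feature vectors
        ys = np.array([y for _, y in self.shards[s]])
        self.models[s] = LogisticRegression().fit(Xs, ys)

    def unlearn(self, x_forget):
        # Drop the point from whichever shard holds it and retrain only that shard.
        for s in range(self.n_shards):
            kept = [(x, y) for x, y in self.shards[s] if not np.allclose(x, x_forget)]
            if len(kept) < len(self.shards[s]):
                self.shards[s] = kept
                self._train_shard(s)

    def predict(self, X):
        # Aggregate shard predictions by majority vote (binary 0/1 labels assumed).
        votes = np.stack([m.predict(X) for m in self.models])
        return (votes.mean(axis=0) > 0.5).astype(int)
```

Because the deleted point only ever influenced one shard, retraining that shard removes its influence exactly, at a fraction of the cost of full retraining.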
Adversarial robustness is about making machine learning systems stronger and more reliable against tricks designed to fool them. These tricks, called adversarial attacks, involve making tiny, almost invisible changes to inputs like images or text to cause the system to make mistakes. For example, an attack might subtly alter a picture of a dog so a system misidentifies it as a cat, even though the change is imperceptible to humans. Adversarial robustness involves developing techniques and training methods to help these systems recognize and resist such manipulations. This is crucial for ensuring the security and dependability of AI in important areas like facial recognition, self-driving cars, and fraud detection, where mistakes can have serious consequences.
Research Topics: certified robustness, theory of adversarial robustness, adversarial training. We are also interested in adversarial and backdoor attacks.
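As a concrete example of the "tiny, almost invisible change" described above, below is a minimal sketch of the Fast Gradient Sign Method (FGSM), a textbook adversarial attack; model is assumed to be any differentiable PyTorch image classifier and epsilon is an assumed perturbation budget. Adversarial training then simply mixes such perturbed examples into the training set.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    """Perturb inputs x (with labels y) by epsilon in the direction of the loss gradient's sign."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then clip to the valid pixel range.
    x_adv = x_adv + epsilon * x_adv.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```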
Concept erasing is a technique used in machine learning to remove specific concepts, including objects, knowledge, or biases, from a trained model. Imagine teaching a computer to recognize various objects in pictures. Sometimes, the model might learn unwanted associations, like linking gender with certain professions. Concept erasing aims to “forget” these unwanted connections without affecting the model’s ability to perform its main tasks. This is done to ensure the AI behaves fairly and ethically, avoiding biases that could lead to discrimination or incorrect predictions. For example, if a model unfairly associates certain jobs with a specific gender, concept erasing would remove this bias, promoting more accurate and unbiased outcomes. This technique is important for creating fair, reliable, and ethical AI systems that make decisions based on relevant information rather than learned prejudices.
Research Topics: fairness, locality, concept removal in diffusion models.
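A minimal sketch of one simple flavor of concept erasing is shown below: linearly removing a single concept direction (for example, one that correlates with gender) from a set of embeddings by projecting onto its orthogonal complement. This is a greatly simplified illustration of the idea behind linear-erasure methods; the variable names and random data are assumptions.

```python
import numpy as np

def erase_direction(embeddings, concept_direction):
    """Project embeddings onto the subspace orthogonal to `concept_direction`."""
    v = concept_direction / np.linalg.norm(concept_direction)
    # Subtract each embedding's component along the concept direction.
    return embeddings - np.outer(embeddings @ v, v)

# Example: after erasure, the embeddings carry no linear information about v.
E = np.random.randn(100, 300)
v = np.random.randn(300)
E_clean = erase_direction(E, v)
print(np.abs(E_clean @ (v / np.linalg.norm(v))).max())  # approximately 0
```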
Explainable AI (XAI) is a branch of artificial intelligence focused on making AI systems’ decisions transparent. It maps black-box deep learning models onto concepts that are understandable to humans. Imagine using a smart assistant that helps you with tasks like recommending books or suggesting routes for travel. Sometimes, its suggestions might seem puzzling. Explainable AI aims to provide clear reasons for these choices, making it easier to understand why the AI made certain recommendations. This is crucial in fields like healthcare or finance, where understanding the reasoning behind AI decisions can be critical. For example, if an AI recommends a particular medical treatment, XAI would enable doctors and patients to see the data and logic that led to this recommendation, ensuring transparency and trust. By making AI decisions more transparent, XAI helps users trust and verify the system’s fairness and accuracy, ultimately leading to more reliable and ethical AI applications.
Research Topics: concept models, faithfulness in XAI, XAI for Generative AI and large models.
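As one of the simplest ways to peek inside the black box, here is a sketch of a gradient-based saliency map: the gradient of the predicted class score with respect to each input pixel indicates how much that pixel influenced the decision. model is assumed to be an arbitrary PyTorch image classifier; this is an illustration of one basic attribution technique, not a complete XAI method.

```python
import torch

def saliency_map(model, image):
    """Return a per-pixel importance map for the model's top predicted class."""
    model.eval()
    x = image.detach().clone().unsqueeze(0).requires_grad_(True)  # add batch dimension
    scores = model(x)
    scores[0, scores.argmax()].backward()        # gradient of the top class score
    return x.grad.abs().squeeze(0).max(dim=0).values  # max over channels -> (H, W) map
```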
Knowledge Editing involves updating or correcting specific pieces of information within large language models (LLMs) without retraining them from scratch. LLMs, like those used in chatbots or virtual assistants, learn from vast amounts of text data to generate human-like responses. However, they can sometimes retain outdated or incorrect information. Knowledge Editing allows us to directly modify these specific pieces of information within the model, ensuring it remains accurate and reliable. For instance, if an LLM incorrectly states that a particular event occurred in 2020 instead of 2021, Knowledge Editing can correct this error without affecting the model’s overall functionality. This technique is essential because it enables the efficient and precise updating of LLMs, maintaining their relevance and accuracy without the time-consuming and resource-intensive process of full retraining. By employing Knowledge Editing, we can ensure that LLMs continue to provide correct and current information, enhancing their trustworthiness and utility in applications such as customer support, education, and content generation.
Research Topics: knowledge editing and unlearning in Large Language Models and Multimodal Models.
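One very simple way to realize such a targeted edit, sketched below, is to freeze the whole model except a single MLP block and take a few gradient steps on the corrected statement, so the new fact is written into a small set of weights rather than retraining everything. The model name (gpt2), the choice of layer 6, and the hyperparameters are illustrative assumptions, not a recommended recipe; dedicated locate-then-edit methods studied in this area are considerably more careful.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

# Freeze everything, then unfreeze one mid-depth MLP block to hold the edit.
for p in model.parameters():
    p.requires_grad = False
for p in model.transformer.h[6].mlp.parameters():
    p.requires_grad = True

corrected = "The event took place in 2021."
batch = tok(corrected, return_tensors="pt")
opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad], lr=1e-4)

for _ in range(20):                      # a few gradient steps on the corrected fact
    out = model(**batch, labels=batch["input_ids"])
    out.loss.backward()
    opt.step()
    opt.zero_grad()
```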
Optical Neural Networks are a type of artificial intelligence that use light, rather than electricity, to perform computations. Traditional neural networks rely on electronic circuits to process information, which can be slow and consume a lot of power. In contrast, optical neural networks use light beams and optical components like lenses, mirrors, and waveguides to carry out the same tasks. This allows them to process data at much higher speeds and with greater energy efficiency. For example, in applications requiring rapid image or signal processing, optical neural networks can significantly outperform their electronic counterparts. This technology leverages the properties of light, such as its ability to travel fast and in parallel, to perform complex calculations quickly and efficiently. As a result, optical neural networks hold great potential for advancing fields that require high-speed data processing, including telecommunications, medical imaging, and real-time video analysis. By harnessing the power of light, optical neural networks offer a promising path toward faster, more efficient AI systems.
Research Topics: optoelectronics hardware acceleration.
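A purely numerical sketch of the abstraction behind a single optical layer: interference in a mesh of beam splitters implements a unitary (linear) transform on the optical field, and photodetection supplies a nonlinearity by reading out intensities. This toy simulation reflects only that abstraction, under our own simplifying assumptions, and does not model any particular device.

```python
import numpy as np

def random_unitary(n, seed=0):
    """A random unitary matrix, standing in for a programmed interferometer mesh."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(a)           # QR factorization yields a unitary factor
    return q

def optical_layer(field, unitary):
    mixed = unitary @ field          # interference: linear mixing of optical modes
    return np.abs(mixed) ** 2        # photodetection: intensity readout as the nonlinearity

x = np.ones(8, dtype=complex)        # input encoded in the amplitudes of 8 optical modes
print(optical_layer(x, random_unitary(8)))
```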
Color is one of the most effective means for both humans and animals to identify, classify, and perceive objects. Owing to its biology, the naked human eye is most sensitive to red, green, and blue (RGB). Colorimetric sensing and imaging is a gold standard in biomedical diagnostics. Unfortunately, the naked eye cannot distinguish the tiny color changes produced by different biological and chemical materials, so optical spectroscopy tools are required. However, benchtop systems are usually bulky, expensive, and designed mainly for laboratory and industrial spectroscopic analysis. In recent years, developing miniaturized, portable, and inexpensive spectrometer systems has become a trending topic in both academia and industry. Unfortunately, due to over-simplified optical designs and the mechanical limits of compact architectures, the performance of miniaturized spectrometer systems is usually far below that of their benchtop counterparts. Strategies to enhance the color identification capability of miniaturized systems are therefore highly desired. Our lab integrates advanced ML algorithms with nanophotonic chips to address this hardware challenge in color recognition (e.g., for emerging portable spectroscopy devices and systems).
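As a sketch of the computational side of this program: a miniaturized spectrometer typically records a handful of broadband filter projections of the incident spectrum, and software reconstructs the spectrum (or a color estimate) from those few numbers. The synthetic filter matrix, the dimensions, and the ridge regularization below are illustrative assumptions, not our actual devices or algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)
n_wavelengths, n_filters = 200, 16
A = rng.random((n_filters, n_wavelengths))      # broadband filter responses (assumed known)
spectrum = np.exp(-0.5 * ((np.arange(n_wavelengths) - 120) / 15) ** 2)  # true spectrum: one peak

y = A @ spectrum + 0.01 * rng.normal(size=n_filters)    # noisy on-chip measurements

# Ridge-regularized least squares: recover 200 spectral points from 16 measurements.
lam = 1e-2
s_hat = np.linalg.solve(A.T @ A + lam * np.eye(n_wavelengths), A.T @ y)
print(np.argmax(s_hat))   # estimated peak position, ideally near index 120
```

Learned (ML-based) reconstructions can replace the simple ridge solver when the filter responses and target spectra are more complex.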