Transparent AI Approach
In all its AI-related projects, DW promotes and is committed to the following Trustworthy AI principles:
Explainability
Transparency
Fairness
Robustness
Privacy
Governance
What is AI Robustness?
Robustness refers to the overall resilience of an AI model against various forms of attack.
AI-powered media systems could be subject to attack by malicious actors who aim to manipulate the system’s AI model in order to take control of or influence its behaviour and results. For example, an adversary could obtain access to the deployed model of the AI component and make minor, imperceptible alterations to the input data that significantly reduce the model’s accuracy. A Robustness Evaluation entails an assessment of the model’s vulnerability to different types of attacks, as well as the testing of defence mechanisms to see how well they perform under attack and how they change the model’s level of accuracy.
There are different types of adversarial attacks on AI systems. They can be categorised by the intent and technique employed by the attacker (such as evasion, poisoning, extraction, or inference), or by the attacker’s level of knowledge of and access to the targeted AI system. In the case of a white-box attack, for instance, it is assumed that the attacker has obtained full access to and good knowledge of the AI model. For black-box attacks, such access is limited, but the attacker can still succeed in influencing the model by sending information to it and receiving information from it. For example, a public or otherwise accessible API of the service would allow an attacker to query it with input data and receive a result, thus enabling them to determine inputs that could deceive the model. Whilst the risk of white-box attacks can be reduced by implementing appropriate IT security provisions that restrict access to and knowledge of the model, the same provisions do not protect against black-box attacks, which always require consideration whenever a model is deployed.
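To illustrate what such a "minor, imperceptible alteration" can look like in a white-box setting, the following minimal Python sketch crafts a PGD-style perturbation against a stand-in classifier. The tiny model, input shape and perturbation budget are placeholder assumptions chosen for illustration only; they are not the actual Deepfake Detection model or its settings.

```python
# Illustrative sketch only: a white-box, PGD-style perturbation in PyTorch.
# The tiny model, input shape and epsilon are hypothetical placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in "detector": a trivial classifier over flattened 32x32 RGB frames.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 3, 32, 32)   # stand-in video frame, values in [0, 1]
y = torch.tensor([1])          # label: 1 = "fake" (assumed convention)

eps = 2 / 255                  # perturbation budget: visually imperceptible
alpha = 0.5 / 255              # step size per iteration
x_adv = x.clone()

for _ in range(10):            # iterative PGD loop
    x_adv.requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    with torch.no_grad():
        # Gradient ascent on the loss pushes the prediction away from the true label.
        x_adv = x_adv + alpha * grad.sign()
        # Project back into the eps-ball around the original input and keep a valid image.
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)

print("max per-pixel change:", (x_adv - x).abs().max().item())
```

Even though the maximum per-pixel change stays below the imperceptibility budget, perturbations of this kind can be enough to flip a model’s prediction, which is exactly what a Robustness Evaluation aims to quantify.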
Robustness and Trustworthy AI
Robustness is one of the Trustworthy AI principles, together with Transparency, Fairness, Privacy, Explainability and Governance. These principles are designed to increase AI-related trust, user acceptance and security, but also support managerial assessment, AI implementation, and legal compliance. The trustworthiness of an AI component can be enhanced by applying algorithmic trustworthy AI tools to the AI model and by supplementing such evaluations with documentation in the form of accessible Model Cards or Fact Sheets which openly discuss the vulnerabilities and limitations of the model in an understandable and meaningful way.
Robustness is an important stand-alone element, but it is also connected with other principles such as Governance and Transparency. After conducting a Robustness Evaluation at the component level, it is important to also provide corresponding documentation in a Model Card or Fact Sheet. Only then is Transparency achieved for end users and managers regarding the level of resilience and security of an AI component. This also enables Governance at the corporate level, for example when assessing compliance with AI Guidelines.
Image by Google DeepMind on Unsplash
Towards Transparent Robustness in AI4Media
Deutsche Welle (DW) runs a practical use case in the EU co-funded AI4Media project together with ATC iLab. Based on our requirements, we received a set of advanced AI functionalities from several technology partners in the project for integration and testing in a demonstrator version of the Truly Media platform for content verification. This included an AI-powered Deepfake Detection service from the MeVer Group at CERTH-ITI (CERTH).
While the prediction given by this AI service can assist users with verifying a suspicious video, the final decision as to whether it is synthetically generated or manipulated remains with the human analyst in a media organisation. For this reason, it was DW’s goal to explore how the level of trust in this new component can be increased, both for end users such as verification specialists and for media managers who need to assess new AI components in the context of AI Guidelines or for commercial integration into an existing media tool. Trust also relates to knowing how robust the Deepfake Detection service remains if it becomes the target of an adversarial attack.
Following DW’s requirements, the Deepfake Detection component was evaluated and enhanced in terms of Robustness by the component owner CERTH in close cooperation with the expert partner in AI4Media for algorithmic trustworthy AI technologies, IBM Research Europe – Dublin (IBM). DW then developed requirements to assist IBM in producing the right kind of transparency information: technical information for the component’s Model Card and more business-oriented input for a co-authored User Guide for media managers in non-technical language.
Seven Steps: Robustness Evaluation and Related Transparency
The workflow to conduct a Robustness Evaluation and develop suitable transparency information consisted of seven steps, which are summarised below:
1. Ensuring that there is a technological match between this AI component and IBM’s algorithmic robustness evaluation tool.
2. Integrating IBM’s open source Adversarial Robustness Toolbox (ART) into the processing pipeline of the Deepfake Detection component.
3. Deliberately subjecting the datasets used by the model of the Deepfake Detection service to a white-box attack, using a Projected Gradient Descent (PGD) attack, and measuring the results (see the illustrative sketch after this list).
4. Conducting a black-box attack using a HopSkipJump attack. Here it is simulated that an attacker influences the output predictions of the Deepfake Detection service for a specific input video by sending information to and receiving information from the model, thereby slowly learning which alterations of the deepfake video are required to evade detection by the AI service.
5. Identifying and testing defence mechanisms, such as JPEG Compression or Spatial Smoothing, that may protect the model from adversarial attacks but can also affect the model’s accuracy levels.
6. Describing the simulated attacks and their influence in the Model Card of the Deepfake Detection service to make the results of the Robustness Evaluation transparent.
7. Developing a User Guide for managers in non-technical language that allows for an assessment of the component’s level of resilience/security, by explaining AI Robustness, the stakeholders and processes involved for this AI component, possible security scenarios and the outcome of the Robustness Evaluation.
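To make steps 2 to 5 more concrete, the sketch below shows what the attack and defence experiments could look like with IBM’s open-source Adversarial Robustness Toolbox (ART). It is a minimal, illustrative example: the stand-in model, the random data and all parameter values are placeholder assumptions, not the actual Deepfake Detection component, datasets or settings used by CERTH and IBM.

```python
# Illustrative sketch only: a white-box PGD attack, a black-box HopSkipJump attack and
# two pre-processing defences evaluated with ART against a hypothetical stand-in model.
import numpy as np
import torch.nn as nn
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import ProjectedGradientDescent, HopSkipJump
from art.defences.preprocessor import JpegCompression, SpatialSmoothing

# Stand-in "detector" over 32x32 RGB frames with two classes (real / fake).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2))
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    input_shape=(3, 32, 32),
    nb_classes=2,
    clip_values=(0.0, 1.0),
)

# A handful of random stand-in frames in place of the real evaluation dataset.
x = np.random.rand(4, 3, 32, 32).astype(np.float32)

# Step 3: white-box Projected Gradient Descent attack with a small perturbation budget.
pgd = ProjectedGradientDescent(estimator=classifier, eps=2 / 255, eps_step=0.5 / 255, max_iter=10)
x_pgd = pgd.generate(x=x)

# Step 4: black-box HopSkipJump attack that only queries the model's predictions.
hsj = HopSkipJump(classifier=classifier, targeted=False, max_iter=5, max_eval=100, init_eval=10)
x_hsj = hsj.generate(x=x)

# Step 5: candidate defences that pre-process inputs before they reach the model.
jpeg = JpegCompression(clip_values=(0.0, 1.0), quality=50, channels_first=True)
smooth = SpatialSmoothing(window_size=3, channels_first=True)
x_jpeg, _ = jpeg(x_pgd)
x_smooth, _ = smooth(x_pgd)

# Compare predictions to see how much each attack and defence shifts the output.
for name, data in [("clean", x), ("PGD", x_pgd), ("HopSkipJump", x_hsj),
                   ("PGD + JPEG", x_jpeg), ("PGD + smoothing", x_smooth)]:
    preds = classifier.predict(data).argmax(axis=1)
    print(name, preds)
```

In the actual evaluation, such comparisons would be run on the real detection model and datasets, and the resulting accuracy changes under attack and defence would feed into the Model Card and the User Guide described in steps 6 and 7.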
Lessons Learned
This explorative work in the AI4Media project showed the value of assessing the vulnerabilities and robustness of AI components used in media tools. Describing the results of such a Robustness Evaluation increases the level of trust in an AI component, especially in relation to the Trustworthy AI principles of Transparency and Governance. Such descriptions can be provided both for technical audiences (e.g., in a Model Card) and for other stakeholders in the media organisation, using a business-oriented approach and non-technical language.
Another lesson related to the many stakeholders involved when an AI component is developed by a third party, deployed in a media tool that is operated by an external technology provider, and then used within the media organisation. This highlighted potential security interfaces in the context of attack scenarios, but also the need for AI transparency information tailored to different target groups.
About AI4Media
With 30 European partners from the media industry, research institutes, academia, and a growing network of stakeholders, the EU co-funded AI4Media research project has several dimensions: it conducts advanced research into AI technologies related to the media industry, develops Trustworthy AI tools, integrates research outcomes in seven media-related use cases, analyses the social and legal aspects of AI in Media, runs a funding programme for applied AI initiatives and establishes the AI Doctoral Academy AIDA.
For resources from the AI4Media project, visit the results section on the project’s website, containing White Papers from the use cases, an in-depth report on AI & Media, as well as specific reports on legal, social, and trustworthy AI aspects. The project also provides open data sets and AI components via Europe’s AI-On-Demand Platform.