Title: “Using Artificial Intelligence to preserve audiovisual archives: new horizons, more questions” by Jean Carrive, INA, France (institut.ina.fr/en)
Abstract: France has a long tradition of preserving its archives as well as its cultural heritage, as demonstrated by the “legal deposit”. Established in the Renaissance for printed documents, the legal deposit aims to allow the collection and consultation of various kinds of documents. INA, the French National Audiovisual Institute, is in charge of this task for France’s radio and television, as well as French media on the web. INA’s mission is to make the most of its collections: commercially by selling programs, and academically by making these collections available to researchers working in the humanities and social sciences.
Since its creation in 1975, INA has constantly developed its tools and methodologies for describing and documenting its collections: databases, thesauri, lexicons, documentation software, indexing methods, search engines, etc. Its Research and Innovation Department has for many years been interested in partnering with academic laboratories to explore the possibilities of automatic content analysis technologies. The emergence of AI-derived technologies is now making it possible to consider new uses of these collections, but also raises new questions.
INA has thus demonstrated that it is now possible to process large audiovisual corpora at scale to identify various kinds of information, thus facilitating indexing, documentation and search in order to provide better services to users. For researchers in the humanities and social sciences working on these resources at Inathèque de France, these new means of analysis make it possible to conduct new types of Digital Humanities investigations, but also introduce new methodological challenges. For INA’s archivists and librarians, AI’s assistance facilitates the documentation process but also poses questions about the impact of these technologies on professional practices, as well as on the scalability of these technologies over time.
The presentation will address these questions, building on the research projects and experiments carried out at INA.
Bio: Jean Carrive holds a Master’s degree in Artificial Intelligence, Pattern Recognition and Application, and a PhD in Computer Science from University Pierre and Marie Curie, Paris 6 (now Sorbonne University). In his thesis “Classification of Audiovisual Sequences”, he combined symbolic techniques from description logics on the one hand and constraint satisfaction techniques on the other hand. He is now Deputy Head of the Research and Innovation Department of the French National Institute of Audiovisual (INA, institut.ina.fr), a public institution dedicated to the preservation and the valorization of the French audiovisual heritage. He participated in or conducted several French or European projects in the area of automatic analysis of audiovisual contents: DiVAN, QUAERO, K-Space, InfoM@gic. He also supervised several PhD theses in this domain, in collaboration with academic institutions.
He now participates in the MeMAD H2020 Project (Methods for Managing Audiovisual Data, memad.eu), which aims to develop automatic tools to facilitate access to audiovisual content, for example for people with disabilities. In the field of digital humanities, he is particularly interested in the application of audiovisual content analysis technologies for historical and heritage uses. In this regard, he is involved in the French ANTRACT project (Transdisciplinary Analysis of French Newsreel, antract.hypotheses.org), which brings together historians and researchers in computer science with the aim of proposing a cross-disciplinary approach to an emblematic audiovisual collection of the mid-twentieth century.
Title: “EU Data Protection Law: An ally for scientific reproducibility?” by Mireille Hildebrandt, Vrije Universiteit Brussel, Belgium and Radboud University, Netherlands
Abstract: This keynote will introduce some of the key concepts of European data protection law, and clarify how and why this is not equivalent to privacy law. Next, I will explain why and how EU data protection law could enhance the methodological integrity of machine learning applications, also in the domain of multimedia.
The question is, first, how the General Data Protection Regulation (GDPR) applies to inferences captured from multimedia data. This raises a number of questions. Does it matter whether such data has been made public by the person it relates to? Does processing personal data always require consent? What counts as valid consent? What if the inferences are mere statistics? What does the prohibition of processing ‘sensitive data’ (ethnicity, health) mean for multimedia analytics? This keynote will provide a crash course in the underlying ‘logic’ of the GDPR, with a focus on what is relevant for inferences based on multimedia content and metadata. I will uncover the purpose limitation principle as the guiding rationale of EU data protection law, protecting individuals against incorrect, unfair or unwarranted targeting.
In the second part of the keynote, I will explain how the purpose limitation principle relates to machine learning research design, requiring keen attention to specific aspects of methodological integrity. These may concern p-hacking, data dredging, or cherry-picking performance metrics, and connect with the reproducibility crisis in machine learning that is on the verge of destroying the reliability of ML applications.
Bio: Mireille Hildebrandt is a Research Professor on ‘Interfacing Law and Technology’ at Vrije Universiteit Brussel (VUB), appointed by the VUB Research Council. She is co-director of the research group on Law Science Technology and Society studies (LSTS) at the Faculty of Law and Criminology. She also holds the part-time Chair of ‘Smart Environments, Data Protection and the Rule of Law’ at the Science Faculty, at the Institute for Computing and Information Sciences (iCIS) at Radboud University. She has been teaching law to master’s students of computer science for the past eight years, resulting in the first serious introduction to law for ‘computer scientists and other folk’, to be published in open access by OUP later this year. See the MIT pubpub site for the open review of the manuscript.
Her research interests concern the implications of automated decisions, machine learning and mindless artificial agency for law and the rule of law in constitutional democracies. Hildebrandt has published 4 scientific monographs, 22 edited volumes or special issues, and over 100 chapters and articles in scientific journals and volumes. Her most recent monograph is ‘Smart Technologies and the End(s) of Law’ (Edward Elgar 2015). In 2018 she received an ERC Advanced Grant for her project on ‘Counting as a Human Being in the era of Computational Law’ (2019-2024): www.cohubicol.com, which will bring together a legal team from VUB with a CS team from Radboud University to explore the minefield of, for example, quantified legal prediction and self-executing contracts.