Lessons from Archives: Strategies for Collecting Sociocultural Data in ML

Algorithmic Bias and Fairness

Strategies for Collecting Sociocultural Data in Machine Learning

WHEN: Monday 3 February 2020 at 1.30 pm
WHERE: Aula 1A150, Math Dept. "T. Levi-Civita"

A growing body of work shows that many problems in fairness, accountability, transparency, and ethics in machine learning systems are rooted in decisions surrounding the data collection and annotation process. We argue that a new specialization should be formed within machine learning that is focused on methodologies for data collection and annotation: efforts that require institutional frameworks and procedures. Specifically for sociocultural data, parallels can be drawn from archives and libraries.

Archives are the longest-standing communal effort to gather human information, and archive scholars have already developed the language and procedures to address and discuss many challenges pertaining to data collection, such as consent, power, inclusivity, transparency, and ethics and privacy. We discuss these five key approaches in document collection practices in archives that can inform data collection in sociocultural machine learning.

conference speaker

Timnit Gebru

Timnit Gebru is the technical co-lead of the Ethical Artificial Intelligence Team at Google. She works on algorithmic bias and data mining. Timnit earned her doctorate under the supervision of Fei-Fei Li at Stanford University in 2017. She is an advocate for diversity in technology and is the cofounder of Black in AI, a community of black researchers working in artificial intelligence.

Next Future Transportation

WHEN: Tuesday 14 January 2020 at 2.30 pm
WHERE: Aula Rostagni, Physics and Astronomy Dept.

The NEXT journey is a personal, scientific and professional venture. Starting from a scientific perspective, it became a real product and service, from Italy to Silicon Valley and Dubai. NEXT is a new mobility paradigm based on modular vehicles that can be a taxi, a bus and an in-motion connection hub infrastructure, creating a real-time flexible network that optimizes ubiquity, price, traffic, energy consumption and comfort.

conference speaker

Tommaso Gecchelin

Tommaso Gecchelin is a physicist and industrial designer from Padua, Italy. He’s the founder, inventor and CTO of Next Future Transportation Inc., which develops modular autonomous pods. His mission is to merge science and design to create useful and elegant solutions to major world problems in the fields of logistics and transportation, politics and decision making, environment and sustainability.

Artificial Intelligence and Big Data: a Paradigm Shift for All

Artificial Intelligence (A.I.) was born in 1956, at a major conference in computer science during which Professors John McCarthy, Marvin Minsky and Herbert Simon, among others, compiled a list of research goals for A.I. in the following years. Twenty years later, most of these goals were still beyond reach. Now, most of us fulfil the 1956 A.I. wish list on a daily basis using our smartphones.

What has made this achievement possible? Contemporary A.I. is powered by statistical methods and fuelled by large amounts of data, known as Big Data. Big data and machine learning affect business as well as several other aspects of society, prompting countless and unpredictable changes in the legal, economic, and cultural fields.

With this introductory seminar, we aim to provide an overview of the paradigm shift that has occurred in the field of A.I. and of the A.I.-induced radical transformations that are moving ahead at full speed, showcasing the main challenges on which the lecture series will later focus.

Speakers: Andrea Pin and Samir Suweis

Under the Hood of Big Data and Artificial Intelligence

Big Data and Artificial Intelligence now have a huge impact on almost all scientific disciplines and, more generally, on our society.

Many claim that data is the “new oil”, and the data-driven, machine-learning paradigm is changing how we address many different problems. Self-driving cars, robot caregivers and chatbot platforms are really happening, while they were only popular sci-fi topics until a few years ago. Deep neural network architectures, trained on very large datasets (e.g. ImageNet) on fast dedicated hardware (e.g. GPUs), are the most popular approach and the main reason why machines can now recognize objects in an image and translate speech in real time.

However, despite the impressive achievements of these technologies, big data and AI are often used as, and referred to as, “black boxes”. The aim of this lecture is to introduce the key concepts that have led to the recent success of these techniques and to highlight the main challenges and open problems, thus trying to unveil what’s in the box.

Speaker: Lamberto Ballan

When: 9 November 2018, 12.30

Where: Aula Rostagni, Physics and Astronomy Dept., Via Paolotti 9, 35131, Padova

Check out the video of the presentation!


Here are the slides with references:

Slide 2018_TLBD_ballan.pdf

The Unconstitutional Algorithm: Artificial Intelligence and the Future of Fundamental Freedoms

Contemporary A.I. is powered by entirely new statistical methods and fuelled by large amounts of data, known as Big Data. The impact of big data and machine learning on business, as well as on several other aspects of society, is prompting challenging shifts in decision-making processes, legal ones included, as some recent cases have demonstrated.

An increasing role is played by so-called “predictive algorithms”, a sort of contemporary oracle able to predict the future accurately without ever saying why. In the lecture, I argue that constitutional law, if it wants to keep to its own aim of regulating power in order to protect freedom, must reflect on when and how to limit the asymmetry of power inevitably caused by differences in computational power and data availability, and on the discriminatory outcomes produced by decisions that rely on biased algorithms.

By presenting some recent cases, I will highlight some critical aspects of the application of AI to judicial and administrative decisions, with a particular focus on the problem of training bias. I’ll conclude by reflecting on the radical transformations that are moving ahead at full speed, showcasing the main challenges that can be effectively tackled only through interdisciplinary approaches, and outlining some of the fundamental principles of the “new constitutional law of the cybernetic era”.

Speaker: Andrea Simoncini, University of Firenze

Date: 07/12/2018 12.30. Aula A, Physics and Astronomy Dept.

Big Data Governance Beyond Personal Data

The echo of the new GDPR gives the impression that the legal issues about big data will concentrate on personal data. This approach is mainly due to the acknowledged importance of the EUCFR, article 8, which suggests and consolidates this idea of data governance.

Big data, however, affect other domains, going beyond personal data. This perspective, therefore, risks being misleading and neglects other relevant aspects.

Data are indeed of significant legal interest for different reasons (e.g. openness and transparency of government, or secrecy for security or commercial purposes), and their informational and computational power may affect further relevant fundamental rights.

The lecture aims to analyze this broader scenario, considering these other themes historically related to data protection (e.g. copyrights and trade secrets; database establishment; public/private re-use of information; intelligence issues; non-personal data regulation).

The final objective is to offer a clearer and more comprehensive understanding of the current state of the art, highlighting the possible developments in the regulation of these fields.

Introduced by: Andrea Pin

Speaker: Elisa Spiller, PhD student, University of Padua, Dipartimento di Diritto Pubblico Internazionale Comunitario

When: 28/02/2019, 12:30.


The Era of Mass Data Litigation

A new wave of mass litigation has begun all over the world as far as the processing of big data is concerned: many companies operating social networks or providing services through the web are allegedly violating the privacy rights of myriads of people and are therefore being named in civil proceedings.

While data breaches may cause limited, if any, damage to a single individual (making them unlikely to bring individual claims), aggrieved groups affected by the breach may seek compensation for the aggregate damage via class actions. Recently Mr. Schrems, a young Austrian activist and lawyer, brought an action before the Regional Civil Court of Vienna against Facebook Ireland Limited, alleging that the defendant had committed numerous infringements of data protection provisions and seeking compensation for the damages suffered by him and thousands of other Facebook users. In addition to this lawsuit, similar proceedings against Facebook, WhatsApp, Instagram, and Google have been commenced in France, Belgium, Germany and also in Italy.

In the United States, Facebook Inc. and the political consulting firm Cambridge Analytica have been sued for obtaining information from 50 million of the social media company’s users without permission. Facebook is also facing a class action in the Northern District of California for other alleged misconduct, including unfair and fraudulent business practices, consumer bait-and-switch, and invasion of privacy.

Furthermore, focusing on cryptocurrencies, Mt. Gox, one of the biggest bitcoin exchanges, based in Tokyo, is facing bankruptcy proceedings in both Japan and the United States after the loss of 850,000 bitcoins, valued at more than $450 million at the time, which were stolen in a massive hack in 2014. In September 2018 Mt. Gox’s bankruptcy and rehabilitation trustees reported that they had collected more than 617 million dollars, which could be sufficient to reimburse claims completely, and invited the investors, in accordance with the Civil Rehabilitation Act of Japan, to file rehabilitation claims. After this announcement, the US investors asked the federal court in California for a stay in the case brought against Mt. Gox until February 28, 2019, in order to get a clear picture of whether they are going to be compensated in full or in part.

This seminar is an opportunity for lawyers, scholars, and students to discuss, study and reflect on the main procedural issues that have emerged in these cases, both with regard to the problems inherent in the aggregate treatment of individual claims and with reference to the new challenges imposed by the processing of massive amounts of personal data in the digital age.

When: 22/03/2019, 10:30/12:30.


Click here to download the poster of The Era of Mass Data Litigation.

Click on the title to download the slideshow presentation:

Italian class action litigation – B. Zuffi

The Multidistrict litigation against Facebook – L. Ferrarese

Users protection in the cryptocurrencies – F. Viggiani

Class Action Suits – G. Gioia

The Rise of Machines and the Disruption of Law

This lecture outlines how machines are coming both to disrupt the legal profession and to change the optimal form of law. The first section describes the relentless growth of computational capacity. The second section maps five areas in which machine intelligence will provide legal services: discovery, legal search, document generation, brief generation, and prediction of case outcomes. The third section shows how these developments will create unprecedented competitive pressures in many areas of lawyering and an unprecedented age of innovation in legal services.

Introduced by: Andrea Pin
Speaker: John McGinnis

John O. McGinnis is a graduate of Harvard College and Harvard Law School where he was an editor of the Harvard Law Review. He also has an MA degree from Balliol College, Oxford, in philosophy and theology. Professor McGinnis clerked on the U.S. Court of Appeals for the District of Columbia. From 1987 to 1991, he was deputy assistant attorney general in the Office of Legal Counsel at the Department of Justice. He is the author of Accelerating Democracy: Transforming Government Through Technology (Princeton 2013) and Originalism and the Good Constitution (Harvard 2013) (with M. Rappaport). He is a past winner of the Paul Bator award given by the Federalist Society to an outstanding academic under 40. He has been listed by the United States on the roster of panelists who may be called upon to decide World Trade Organization Disputes.

When: 8/05/2019, 12:30/14:30.


Click here to download the poster of The Rise of Machines and the Disruption of Law.

Just Machine Learning in Unjust World?

Artificial intelligence systems are meant to transcend some of the imperfections of the human mind, being grounded in mathematics and operating on data rather than emotion or subjective perception. But as more algorithms are woven into daily life, the limits of their objectivity are being revealed. Algorithms can amplify the biases that we have in society.

For example, a ProPublica investigation last year found that proprietary software used to predict future criminals was biased against black people. This talk will illustrate these bias-related problems and show that part of the problem in creating fair algorithms is the concept of fairness itself.

What’s considered fair and precise in the field of computer science may not translate well to justice in the real world. One way to address this problem is by putting computer scientists into conversation with ethicists, philosophers, and others from fields that have historically examined justice and fairness.

Speaker: Tina Eliassi-Rad


5 Reasons Why Social Networks Make Us Vulnerable to Misinformation

As social media become major channels for the diffusion of news and information, it becomes critical to understand how the complex interplay between cognitive, social, and algorithmic biases triggered by our reliance on social networks makes us vulnerable to disinformation. This talk overviews ongoing network analytics, modeling, and machine learning efforts to study the viral spread of misinformation and to develop tools for countering the online manipulation of opinions.

Joint work with collaborators at the Center for Complex Networks and Systems Research (cnets.indiana.edu) and the Indiana University Network Science Institute (iuni.iu.edu). This research is supported in part by the National Science Foundation, McDonnell Foundation, DARPA, Yahoo, and Democracy Fund. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of these funding agencies.

Speaker: Prof. Filippo Menczer, Indiana University.

The seminar will be in AULA B at the Physics and Astronomy Dept., Via Paolotti 9, 35131, Padova.