When U.S. tech firm OpenAI rolled out Whisper, a speech recognition tool offering audio transcription and translation into English for dozens of languages including Māori, it rang alarm bells for many Indigenous New Zealanders.
Whisper, launched in September by the company behind the ChatGPT chatbot, was trained on 680,000 hours of audio from the web, including 1,381 hours of the Māori language.
Indigenous tech and culture experts say that while such technologies can help preserve and revive their languages, harvesting their data without consent risks abuse, the distortion of Indigenous culture, and the loss of minorities' rights.
“Data is like our land and natural resources,” said Karaitiana Taiuru, a Māori ethicist and an honorary academic at the University of Auckland.
“If Indigenous peoples don’t have sovereignty of their own data, they will simply be re-colonised in this information society.”
OpenAI did not respond to a request for comment.
In a statement on its website, OpenAI said it collaborates “with industry leaders and policymakers to ensure that AI systems are developed in a trustworthy manner.”