Millions of Workers Are Training AI Models for Pennies

From the Philippines to Colombia, low-paid workers label training data for AI models used by the likes of Amazon, Facebook, Google, and Microsoft.
Oskarina Vero Fuentes with her dog. Courtesy of Oskarina Vero Fuentes

In 2016, Oskarina Fuentes got a tip from a friend that seemed too good to be true. Her life in Venezuela had become a struggle: Inflation had hit 800 percent under President Nicolás Maduro, and the 26-year-old Fuentes had no stable job and was balancing multiple side hustles to survive.

Her friend told her about Appen, an Australian data services company that was looking for crowdsourced workers to tag training data for artificial intelligence algorithms. Most internet users will have done some form of data labeling: identifying images of traffic lights and buses for online captchas. But the algorithms powering new bots that can pass legal exams, create fantastical imagery in seconds, or remove harmful content on social media are trained on datasets—images, video, and text—labeled by gig economy workers in some of the world’s cheapest labor markets.

Appen’s clients have included Amazon, Facebook, Google, and Microsoft, and the company’s 1 million contributors are just a part of a vast, hidden industry. The global data collection and labeling market was valued at $2.22 billion in 2022 and is expected to grow to $17.1 billion by 2030, according to consulting firm Grand View Research. As Venezuela slid into an economic catastrophe, many college-educated Venezuelans like Fuentes and her friends joined crowdsourcing platforms like Appen.

For a while, it was a lifeline: Appen meant Fuentes could work from home at any hour of the day. But then the blackouts started—power cutting out for days on end. Left in the dark, Fuentes was unable to pick up tasks. “I couldn't take it anymore,” she says, speaking in Spanish. “In Venezuela, you don't live, you survive.” Fuentes and her family migrated to Colombia. Today she shares an apartment with her mother, her grandmother, her uncles, and her dog in the Antioquia region.

Appen is still her sole source of income. Pay ranges from 2.2 cents to 50 cents per task, Fuentes says. Typically, an hour and a half of work will bring in $1. When there are enough tasks to work a full week, she earns approximately $280 per month, almost meeting Colombia’s minimum wage of $285. But filling out a week with tasks is rare, she says. Down days, which have become increasingly common, will bring in no more than $1 to $2. Fuentes works on a laptop from her bed, glued to her computer for over 18 hours a day to get the first pick of tasks that could arrive at any time. Given Appen’s international clients, days begin when the tasks come out, which can mean 2 am starts.

It’s a pattern that’s being repeated across the developing world. Labeling hot spots in East Africa, Venezuela, India, the Philippines, and even refugee camps in Kenya and Lebanon’s Shatila camp, offer cheap labor. Workers pick up microtasks for a few cents each on platforms like Appen, Clickworker, and Scale AI, or sign onto short-term contracts in physical data centers like Sama’s 3,000-person office in Nairobi, Kenya, which was the subject of a Time investigation into the exploitation of content moderators. The AI boom in these places is no coincidence, says Florian Schmidt, author of Digital Labour Markets in the Platform Economy. “The industry can flexibly move to wherever the wages are lowest,” he says, and can do it far quicker than, for example, textile manufacturers.

Some experts see platforms like Appen as a new form of data colonialism, says Saiph Savage, director of the Civic AI lab at Northeastern University. “Workers in Latin America are labeling images, and those labeled images are going to feed into AI that will be used in the Global North,” she says. “While it might be creating new types of jobs, it's not completely clear how fulfilling these types of jobs are for the workers in the region.” Because the goalposts of AI are always moving, workers are in a constant race against the technology, says Schmidt. “One workforce is trained to three-dimensionally place bounding boxes around cars very precisely, and suddenly it's about figuring out if a large language model has given an appropriate answer,” he says, regarding the industry’s shift from self-driving cars to chatbots. Thus, niche labeling skills have a “very short half-life.”

“From the clients’ perspective, the invisibility of the workers in microtasking is not a bug but a feature,” says Schmidt. Economically, because the tasks are so small, it's more feasible to deal with contractors as a crowd instead of individuals. This creates an industry of irregular labor with no face-to-face resolution for disputes if, say, a client deems their answers inaccurate or wages are withheld.

The workers WIRED spoke to say it’s not low fees but the way platforms pay them that’s the key issue. “I don't like the uncertainty of not knowing when an assignment will come out, as it forces us to be near the computer all day long,” says Fuentes, who would like to see additional compensation for time spent waiting in front of her screen. Mutmain, 18, from Pakistan, who asked that his surname not be used, echoes this. He says he joined Appen at 15, using a family member’s ID, and works one shift from 8 am to 6 pm and another from 2 am to 6 am. “I need to stick to these platforms at all times, so that I don't lose work,” he says, but he struggles to earn more than $50 a month.

He is compensated only for time spent entering details on the platform, which underestimates his labor, he says. For instance, a social-media-related task may pay a dollar or two per hour, but the fee doesn’t account for the additional necessary research time spent online, he says. “One needs to work five or six hours to complete what effectively amounts to an hour of real-time work, all to earn $2,” he says. “In my point of view, it is digital slavery.” An Appen spokesperson said the company is working to reduce the amount of time spent in search of tasks, but the platform must strike a “careful balance” between providing clients with quickly completed tasks and contributors with a consistent workflow.

Fuentes is now on a Telegram group chat with other Venezuelan Appen workers, where they crowdsource advice and vent grievances—their version of a Slack channel or water-cooler-chat substitute. After seven years of completing tasks on Appen, Fuentes says she and her colleagues would like to be considered employees of the tech companies that they train algorithms for. But in AI labeling’s race to the bottom, years-long contracts with benefits are not on the horizon. In the meantime, she would like to see the industry unionized. “I would like them to consider us not just as work tools that can be thrown away when we are no longer useful but as human beings that help them in their technological advancement,” she says.

This story appears in the November/December 2023 edition of WIRED UK.