OpenAI is reportedly asking contractors to upload real work from their previous jobs as part of its artificial intelligence training process, sparking debate over data ethics.
According to Wired, the ChatGPT maker is working with training data firm Handshake AI and has instructed contractors to submit authentic materials they produced themselves.
Several reports suggest these materials are likely to include spreadsheets, code repositories, images, documents, and more. Contractors are advised to remove proprietary and personal data before submission, with the company providing a dedicated "Superstar Scrubbing" tool to assist with data cleaning.
The initiative is part of the company's push toward higher-quality, domain-specific training data as AI systems advance beyond general capabilities.
Real professional work is seen as more effective than publicly available data for teaching models complex reasoning, communication, and decision-making skills.
However, legal experts have raised concerns. Intellectual property lawyer Evan Brown told Wired that the approach relies heavily on contractors' judgment to determine what counts as confidential, exposing both AI companies and contractors to legal risk.
The move has also prompted users to raise concerns about consent, compensation, and the long-term impact on white-collar professions, as AI models trained on real work could eventually automate similar roles.
As regulators around the globe increase scrutiny of AI data practices, OpenAI's approach is drawing significant attention to how training data is sourced across the sector.