For those of you who don’t remember, it came to my attention in December of 2022 that people were data mining (DM) AO3 to help train writing AIs (that post is here), which ultimately led me to Archive-Lock Ghost in the NYC.
Recently, AO3 has posted an announcement updating users on the situation. You can read the article here, but here is a list of highlights:
- Common Crawl (the dataset that is used to train writing AI) did scrape the site in December 2022, and as much as AO3 wants to remove its content from that dataset, they can’t do so
- AO3 staff are on the lookout for individual scrapers collecting AO3 data, and plan to take action as needed
- AO3 has put in technical measures to hinder large-scale DM, including rate limiting and moderating site traffic
- Code has been put into place to stop Common Crawl from scraping the site again
- Staff recommend that for extra protection, users put their works in Archive Lock
- AI-generated works are not against AO3’s current policy, but they could potentially violate the anti-spam policy that is in place. Staff encourage users to report to the Policy and Abuse team if they are unsure whether a work or a user is going against policy
- Staff are discussing changes to AO3’s policy about AI-generated works and will make a public announcement if there are changes to its policy





