SocialArks is a fast-growing, well-known Chinese social media management company that works on problems facing China’s foreign trade industry, such as marketing and brand building. The company suffered a massive leak of more than 400GB of scraped data, exposing the personal details of a huge number of users, influencers, and celebrities. Scraped data is data collected through data scraping, the automated extraction of data from social media platforms or websites.
The cause of the leak was a misconfigured ElasticSearch database operated by SocialArks, which mainly contained PII (personally identifiable information) of users from popular social media platforms, namely Facebook, Instagram, and the business network LinkedIn. The leak was brought to light in a blog post by Safety Detectives, whose researchers found it while searching for bugs and potential risks that could pose a threat to the public.
The researchers were running a routine IP address check for unsecured or misconfigured databases when they found the company’s server. It was publicly exposed, was not encrypted, and did not require a password. It held more than 318 million user records, including sensitive personal information, totaling 408GB of data. The affected server, hosted by Tencent, was segmented so that the data obtained from each social media source was stored separately, which allowed the researchers to investigate the data further.
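The kind of exposure described above can be checked defensively on a server you operate yourself. The following is a minimal sketch, not the method Safety Detectives used: the function names `classify_status` and `check_endpoint` are hypothetical, and the only assumption about the server is that it answers plain HTTP requests. A server that returns data to a request carrying no credentials is open to anyone who can reach it.

```python
import urllib.error
import urllib.request


def classify_status(status_code: int) -> str:
    """Interpret the HTTP status returned to a request sent WITHOUT credentials."""
    if status_code == 200:
        return "open"        # data was served to an anonymous caller
    if status_code in (401, 403):
        return "protected"   # the server asked for credentials or refused access
    return "unknown"         # any other status needs a manual look


def check_endpoint(url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET to a server you own and classify the reply."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises for 4xx/5xx responses; the status code is still available
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

For example, `check_endpoint("http://localhost:9200/")` would probe a local ElasticSearch instance on its default HTTP port; the address here is only an illustration. A result of `"open"` means the database should be placed behind authentication and kept off the public internet.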
The researchers concluded that all of the leaked data had been scraped from social media users, a practice that is considered unethical and that violates the terms of service of Instagram, Facebook, and LinkedIn. The scraped data consisted of 81,551,567 Facebook profiles, 66,117,839 LinkedIn profiles, 11,651,162 Instagram profiles, and an additional 55,300,000 Facebook profiles that were deleted after the server’s vulnerability was discovered.
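To put the per-platform figures in perspective, the short sketch below simply adds up the profile counts quoted above. The dictionary keys are illustrative labels, not names used in the report. The result is the number of profiles listed by platform, which is a separate figure from the record count quoted for the server as a whole.

```python
# Profile counts as reported for the exposed server
profiles = {
    "Facebook": 81_551_567,
    "LinkedIn": 66_117_839,
    "Instagram": 11_651_162,
    "Facebook (deleted after discovery)": 55_300_000,
}

# Total across all listed platforms
total_profiles = sum(profiles.values())

# Facebook alone, counting the profiles that were later deleted
facebook_total = profiles["Facebook"] + profiles["Facebook (deleted after discovery)"]

print(f"All listed profiles: {total_profiles:,}")   # 214,620,568
print(f"Facebook profiles:   {facebook_total:,}")   # 136,851,567
```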
SocialArks had suffered an earlier data breach around August. Researchers noticed that data belonging to the same users appeared in both leaks, although the two incidents differed in the size of the database, the hosting servers, and other details. The leaked data on hundreds of millions of users included their names, profile pictures, follower counts, biographies, frequently used hashtags, messenger IDs, location details (the country in most cases and a detailed address in some), contact details including emails (for more than 10 million users) and phone numbers (for more than 6 million users), total comments, user tags, domain names, company names and revenue margins, and employment status with job title and position.
With the rapid growth of online services, data scraping has become quite common, but it remains unethical, and it is illegal when it breaches the terms and conditions agreed to on the platform. Most data scraping is not meant to harm the general public; it is done mainly for business purposes, such as analyzing users’ choices in the marketplace. Still, it cannot be ignored that if scraped data is left unprotected, unencrypted, or open to unauthorized access, it will sooner or later be exposed in a leak or breach that affects millions of users. Safety Detectives highlights a few points that users should keep in mind to avoid becoming part of, or a victim of, a data leak.
The suggestions are to be careful about which information you share and to leave out anything unnecessary, to use only secure websites that show a lock icon or HTTPS, to set strong passwords, to open or click only emails that come from a trusted source, to hide social media profiles from the public so that they are visible only to trusted close contacts, and to avoid entering payment card information or passwords outside your own personal network.