It was August 2013 when the biggest data breach in the history of the internet took place, but no one would know about it until three years later. In September 2016, Yahoo! notified the public that 500 million user accounts had been breached in a 2014 cyberattack. Three months later, the company came forward again to disclose a separate breach, the one that had occurred in 2013. Initially believed to have affected 1 billion user accounts, it was later found to have impacted 3 billion, making it a breach of unprecedented size. The event had a considerable impact on public perception of the importance of data protection, and contributed to a general increase in the protective measures companies worldwide take against cyberattacks.
With the rise of digitalization, an increasing number of companies rely on digital solutions to conduct their operations and store their data. All this digital information must be protected from unauthorized users, especially since hacking methods have become increasingly sophisticated and the data at risk increasingly valuable, in both quality and quantity. Avoiding cyberattacks and data breaches and protecting computer infrastructure has become a top priority for companies, with worldwide spending on cybersecurity rising from $34B in 2017 to $60B in 2021.
As more and more security vulnerabilities emerge, research in the field of cybersecurity is expanding, especially in the private sector.
Personal data ownership
Among the new players in data security, some startups are developing methods that use blockchain technology to return ownership of data to consumers and create new monetization opportunities. Through this model, data passes from centralized platforms to users, who gain control over their personal information and can therefore decide how and when to use it.
Some of the startups that are part of this trend are:
- Ecosteer: Italian SaaS provider that lets data owners control data visibility at its point of origin (i.e. any IoT device). Its system allows them to unilaterally grant consent to different kinds of usage and to revoke it.
- Recheck: based in the Netherlands, it helps clients leverage decentralized tech, create a digital identity for their products, and verify the history and data integrity of their items.
- Invisibly: from Missouri, USA, it allows users to unlock paywalled and ad-free content from publishers they love by sharing data, for example by answering surveys and inviting friends.
Passwordless authentication
Another trend that is on the rise is passwordless authentication. Since more than 60% of data breaches can be attributed to weak or reused passwords, this technology has gained ground in recent years as an answer to the security flaws related to traditional authentication methods.
Invisible multi-factor authentication (MFA) could drive the mass adoption of passwordless identity verification, an automated and seamless authentication process that could help prevent cyberattacks.
Some of the startups working on next-generation MFA and offering this kind of solution are:
- HYPR: authenticates employees and consumers directly through their devices, such as smartphones and laptops, instead of using passwords.
- Auth0: an adaptable authentication and authorization platform. It has recently joined forces with Okta to build new customer identity and access management (CIAM) solutions.
- Yubico: hardware authentication through a USB key called YubiKey, which can be combined with other factors for two-factor and multi-factor authentication.
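To give a feel for how a login can work without a password, here is a minimal sketch of a challenge-response flow in which the secret stays on the user's device. It is a simplified illustration, not how any of the products above are implemented: the `Device` and `Server` classes are hypothetical, and a shared HMAC key stands in for the public-key cryptography that real passwordless standards typically use.

```python
import hashlib
import hmac
import secrets


class Device:
    """Hypothetical user device (phone, laptop, hardware key) holding a secret."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)

    def enroll(self) -> bytes:
        # Assumption for this toy: the key is shared once at enrollment.
        # A public-key scheme would register only a public key instead.
        return self._key

    def sign(self, challenge: bytes) -> bytes:
        # Prove possession of the key without ever sending it.
        return hmac.new(self._key, challenge, hashlib.sha256).digest()


class Server:
    """Hypothetical login service that never stores or checks a password."""

    def __init__(self) -> None:
        self._keys: dict[str, bytes] = {}
        self._pending: dict[str, bytes] = {}

    def register(self, user: str, key: bytes) -> None:
        self._keys[user] = key

    def new_challenge(self, user: str) -> bytes:
        # A fresh random challenge per login attempt prevents replay.
        challenge = secrets.token_bytes(16)
        self._pending[user] = challenge
        return challenge

    def verify(self, user: str, response: bytes) -> bool:
        # Each challenge can be used only once.
        challenge = self._pending.pop(user, None)
        if challenge is None or user not in self._keys:
            return False
        expected = hmac.new(self._keys[user], challenge, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


device, server = Device(), Server()
server.register("alice", device.enroll())
challenge = server.new_challenge("alice")
print("login accepted:", server.verify("alice", device.sign(challenge)))
```

Nothing the user has to remember or type crosses the network here, which is why reused or weak passwords stop being an attack surface in this kind of design.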
Synthetic data for privacy
As customers and users become more and more aware of their privacy rights, some companies are currently experimenting with synthetic datasets to enable data sharing and collaboration.
Various organizations are turning to synthetic datasets, especially in industries where there is not enough real-world data to train AI or where the information is sparse, since this technology relies on information that’s artificially generated by algorithms rather than produced by real-world events. This process is particularly useful in protecting customers’ information, since it makes it possible to generate data that sidesteps the privacy issues that would arise from using real consumer data.
Although mimicking real-world data poses several challenges, even large companies such as J.P. Morgan and Telefónica are betting on this technology, relying on promising startups that have emerged in this field.
An interesting example of how relationships between corporations and small companies are developing within this trend is Gretel, a startup that recently partnered with biotech company Illumina. The two used real genotype and phenotype data to train an AI algorithm that generates artificial genomic data, creating a usable dataset while protecting the sensitive information of Illumina’s customers.
Other startups working on synthetic data that are worth keeping an eye on:
- Anyverse: design, train, validate or test the data for your AI perception systems. It enables the virtual generation of myriads of scenarios for a use-case based on high-fidelity, physics-based simulation tech.
- Syntho: it provides a “self-service” platform to connect and integrate the datasets to generate and analyze synthetic data.
- Datomize: provider of synthetic data for the development, training and testing of AI/ML models and applications.
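The basic idea behind synthetic data can be sketched in a few lines: learn some statistics from the real records, then sample brand-new records from those statistics. The example below is a deliberately naive illustration and not the method used by any company mentioned here. The sample rows are invented, the function names are hypothetical, each column is modeled as an independent Gaussian (so correlations between columns are lost), and the approach carries no formal privacy guarantee.

```python
import random
import statistics


def fit_gaussian_model(rows):
    """Learn a (mean, standard deviation) pair for every numeric column."""
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def sample_synthetic(model, n, seed=0):
    """Draw n artificial rows from the fitted per-column distributions."""
    rng = random.Random(seed)
    return [tuple(rng.gauss(mu, sigma) for mu, sigma in model) for _ in range(n)]


# Invented "real" customer data: (age, monthly spend).
real_rows = [
    (34, 120.0),
    (45, 80.5),
    (29, 200.0),
    (52, 65.0),
    (41, 150.0),
    (38, 95.0),
]

model = fit_gaussian_model(real_rows)
synthetic = sample_synthetic(model, n=1000)

# The synthetic rows follow similar overall statistics,
# but none of them corresponds to an actual customer.
print("real mean age:     ", round(statistics.mean(r[0] for r in real_rows), 1))
print("synthetic mean age:", round(statistics.mean(r[0] for r in synthetic), 1))
```

Production tools replace the per-column Gaussian with far richer generative models, which is where the challenge of faithfully mimicking real-world data comes in.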
Fully Homomorphic Encryption (FHE)
An encryption technique that has proved particularly helpful to companies that want to protect large amounts of private data is homomorphic encryption, which allows users to work on encrypted data without decrypting it.
Through this process, data access can be separated from data processing: the information remains encrypted throughout its life cycle and can be safely outsourced, thus avoiding the need for additional privacy protocols when working with third parties or in untrusted environments.
Among the companies working on FHE solutions, some worth mentioning are:
- Zama: their software enables web2 and web3 developers to use FHE without the need to learn cryptography.
- Ravel: their work is focused on making FHE fully scalable and faster than the current state of the art, reportedly with impressive results.
- Dualitytech: lets organizations securely share and analyze sensitive data while keeping it fully encrypted throughout the process. Launched an open-source fully homomorphic encryption (FHE) library, OpenFHE, in July 2022.
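To make "working on encrypted data without decrypting it" more concrete, here is a toy version of the Paillier cryptosystem. Two caveats apply: Paillier is only additively homomorphic (it supports adding ciphertexts and multiplying them by a plain number), so it is a simpler relative of FHE rather than FHE itself, and the tiny hard-coded primes make this sketch completely insecure. It is not based on the libraries of the companies above, and the function names are chosen for illustration only.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two primes (far too small here)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1                 # standard simple choice of generator
    mu = pow(lam, -1, n)      # valid because g = n + 1 (needs Python 3.8+)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    while True:
        r = secrets.randbelow(n - 1) + 1  # random blinding factor
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the hidden plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale_encrypted(pub, c, k):
    """Raising a ciphertext to the power k multiplies the plaintext by k."""
    n, _ = pub
    return pow(c, k, n * n)


pub, priv = keygen(1009, 1013)

# An untrusted third party could run this step holding only ciphertexts.
encrypted_total = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))

# Only the key holder can read the result.
print("decrypted total:", decrypt(pub, priv, encrypted_total))
```

The party doing the addition never sees 20, 22 or their sum, which is the separation of data access from data processing described above. Fully homomorphic schemes extend the same principle to arbitrary computations.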
Anonymizing personally identifiable info (PII)
Finally, let’s talk about anonymizing personally identifiable info (PII). This technology applies a set of processes and methods to a dataset to make it more difficult (or even impossible) to identify an individual in it. With this barrier in place, reaching any individual’s information requires detailed knowledge of the data.
Anonymizing large datasets without the need for manual data entry and without human supervision is certainly useful to protect business personnel and corporate data, especially in large companies: the larger the environment, the higher the risk of storing unidentified and unorganized PII. That is why many companies are increasingly relying on AI- and ML-driven processes and the previously mentioned synthetic datasets in order to secure their information while retaining its value to the organization.
One example of a company that automates data anonymization is Hazy, which focuses entirely on AI-generated smart synthetic data and has worked to make it both private and representative.
Among the startups working on PII anonymization, Anonos has raised $50 million in growth funding from Aon and GT Investment Partners. Its Data Embassy solution uses a combination of legal pseudonymization, synthetic data and other computing techniques to protect customer identities by decoupling personal data from direct and indirect identifiers.
Outside the EU and the US, the South Korean AI data company INFINIQ is expanding in the global market with the launch of its Wellid data anonymization solution, which automatically detects and anonymizes all identifiable faces and license plates in captured videos and images.
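As a rough illustration of decoupling data from direct identifiers, the sketch below replaces names and email addresses with keyed-hash tokens and masks email addresses that appear in free text. It is a simplified example with an invented record and hypothetical function names, not a description of how Hazy, Anonos or INFINIQ work. It is also pseudonymization rather than full anonymization: indirect identifiers such as age are left untouched and could still help re-identify someone.

```python
import hashlib
import hmac
import re

# Simple pattern for email addresses inside free text (not exhaustive).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(value: str, key: bytes) -> str:
    """Map an identifier to a stable token that cannot be reversed without the key."""
    digest = hmac.new(key, value.lower().encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def anonymize_record(record: dict, key: bytes,
                     direct_identifiers=("name", "email")) -> dict:
    """Tokenize direct identifiers and mask emails found in other text fields."""
    cleaned = {}
    for field, value in record.items():
        if field in direct_identifiers:
            cleaned[field] = pseudonymize(value, key)
        elif isinstance(value, str):
            cleaned[field] = EMAIL_RE.sub("[EMAIL]", value)
        else:
            cleaned[field] = value  # indirect identifiers are NOT handled here
    return cleaned


# Invented example record and key, for illustration only.
secret_key = b"keep-this-key-outside-the-dataset"
record = {
    "name": "Jane Doe",
    "email": "jane@example.com",
    "age": 41,
    "notes": "Contact via jane@example.com after 5pm",
}

print(anonymize_record(record, secret_key))
```

Because the same input always yields the same token, records belonging to one person can still be linked for analysis, while the key that would allow re-identification is stored separately from the data.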
This was just a short list of trends in data security.
Evidently, there are many more fields and sub-categories to explore and many more promising startups focusing their efforts on this area: we’ll keep on scouting new technologies and companies in data security and cybersecurity to keep you updated on the growth of this field.