Some of the most valuable data is also data that is not easily shared. Distributed data science is a new technique for overcoming this issue by leaving data ‘in place’ and sending algorithms to the data. This enables data scientists to extract value from these datasets while upholding strict privacy and security guarantees.
In this talk, we’ll briefly introduce the fundamentals of distributed data science, including federated machine learning with additional privacy measures. We’ll then show how a new, easy-to-use platform can train models at scale on sensitive datasets. We’ll also run through example experiments showing that, without such approaches, we simply cannot train ML models of sufficient quality.