Inno-Thought Team

CERN Create Natural Language Processing (NLP) Datasets in Particle Physics with Applications in AI

Vectorspace AI & CERN Create Natural Language Processing (NLP) Datasets in Particle Physics with Applications in Artificial Intelligence (AI) for Every Industry



Vectorspace AI and CERN, the European Organization for Nuclear Research and the largest particle physics laboratory in the world, are creating datasets used to detect hidden relationships between particles, with broad implications across multiple industries. These datasets can provide a significant increase in precision, accuracy, signal or alpha for any company in any industry.


For commercial use, datasets are $0.99c per minute/update and $0.99c per data source, row, column and context, with additional configurations and options available case by case through a SaaS/DaaS monthly subscription. Over 100 billion unique and powerful datasets are available based on customized data sources, rows, columns or language models.


While data can be viewed as unrefined crude oil, Vectorspace AI produces datasets which are the refined 'gasoline' powering all Artificial Intelligence (AI) and Machine Learning (ML) systems. Latest research suggests "The Next Big Breakthrough in AI Will Be Around Language" – HBR.


Datasets are algorithmically generated based on formal Natural Language Processing/Understanding (NLP/NLU) models, including OpenAI's GPT-3 and Google's BERT, along with word2vec and other models which were built on top of vector space applications at Lawrence Berkeley National Laboratory and the US Dept. of Energy (DOE).


Datasets are real-time and designed to augment or append to existing proprietary datasets such as gene expression datasets in life sciences or time-series datasets in the financial markets. Example customer and industry use cases include:


Particle Physics: Rows are particles. Columns are properties. Used to predict hidden relationships between particles.


Life Sciences: Rows are infectious diseases. Columns are approved drug compounds. Used to predict which approved drug compounds might be repurposed to fight an infectious disease such as COVID-19. Applications include processing 1,500 peer-reviewed scientific papers every 24 hours for real-time dataset production.


Financial Markets: Rows are equities. Columns are themes or global events. Used to predict hidden relationships between equities and global events. Applications include thematic investing and smart basket generation and visualization.
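
To make the row/column layout in the use cases above concrete, here is a minimal sketch of how such a dataset could be assembled. It assumes each row and column label already has a vector representation (for example from a word2vec-style model) and scores each pair with cosine similarity. The labels, toy vectors and function names are hypothetical and chosen for illustration only; this is not Vectorspace AI's actual scoring method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_dataset(row_vectors, column_vectors):
    """Return a nested dict: dataset[row][column] = similarity score."""
    return {
        row: {col: cosine(rv, cv) for col, cv in column_vectors.items()}
        for row, rv in row_vectors.items()
    }


# Hypothetical toy vectors; real embeddings would have hundreds of dimensions.
rows = {"EQUITY_A": [1.0, 0.0, 0.0], "EQUITY_B": [0.0, 1.0, 0.0]}
cols = {"THEME_X": [1.0, 0.0, 0.0], "THEME_Y": [1.0, 1.0, 0.0]}

dataset = build_dataset(rows, cols)
for row, scores in dataset.items():
    print(row, {col: round(score, 3) for col, score in scores.items()})
```

A higher score marks a row/column pair as more closely related in the chosen vector space. The same layout applies whether the rows are particles, infectious diseases or equities.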


Data provenance, governance and security are addressed via the Dataset Pipeline Processing (DPP) hash blockchain and VXV utility token integration. Datasets are accessed via the VXV wallet-enabled API, where VXV is acquired and used as a utility token credit. VXV trades on a cryptocurrency exchange.
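
The announcement does not describe how the DPP hash blockchain works internally. As a general illustration of how hash chaining can support data provenance, the sketch below links each pipeline step to the previous one with a SHA-256 hash, so that later tampering is detectable. The record fields, step names and function names are assumptions made for this example and do not reflect the actual DPP format or the VXV API.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def record_hash(payload, prev_hash):
    """Hash a record's payload together with the previous record's hash."""
    body = json.dumps({"payload": payload, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a pipeline step to the chain, linking it to the last record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prev": prev_hash, "hash": record_hash(payload, prev_hash)}
    )
    return chain


def verify(chain):
    """Return True only if every record links to and matches its predecessor."""
    prev_hash = GENESIS
    for rec in chain:
        if rec["prev"] != prev_hash:
            return False
        if rec["hash"] != record_hash(rec["payload"], rec["prev"]):
            return False
        prev_hash = rec["hash"]
    return True


# Hypothetical pipeline steps, for illustration only.
chain = []
append_record(chain, {"step": "ingest", "source": "example-source"})
append_record(chain, {"step": "vectorize", "model": "example-model"})
print(verify(chain))
```

Changing any recorded payload after the fact alters its hash and breaks the link to every later record, which is what makes the history auditable.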



 

About Vectorspace AI


Vectorspace AI science and technology originated in Life Sciences and currently focuses on context-controlled NLP/NLU (Natural Language Processing/Understanding) and feature engineering for hidden relationship detection in data for the purpose of powering advanced approaches in Artificial Intelligence (AI) and Machine Learning (ML).

 

SOURCE: Vectorspace AI
