2nd Floor College House, 17 King Edwards Road, Ruislip, London HA4 7AE, UK

WebRobot and the Future of Big-Data

Mission, vision, and technology
Massive scraping on dedicated Spark clusters

WebRobot aims to become the hub and reference point for all web scraping specialists, drawing on the most innovative technologies to provide an affordable, high-quality big-data service at all times.

The Mission


Our goal is to become a complete ETL (Extract, Transform, Load) service based on cloud computing and big data, involving data extraction, web mining, machine learning, and big-data analytics.

In the future, we see WebRobot at the centre of the data supply chain, with the most powerful extraction engine and a comprehensive ETL system that collects and delivers data between the Web, IoT, drones, smart manufacturing, smart cities, and any other devices. Our purpose is to provide the right tools to improve business and life in the most efficient and scalable way.

The Vision


We believe technology is an opportunity, not a threat. We want to be protagonists of the upcoming revolution: a profound socio-economic change driven by technology.

We have been working with our team and through strategic partnerships to allow all our stakeholders to reach financial freedom through our services and business model.

Thanks to a unique B2B2X approach and a profit-share scheme applied to the data industry, we are building a complete ecosystem in which all parties involved can improve their business, products, and services, and monetize their effort and investment.

The Technological Context

WebRobot high-level architecture (diagram)

Data-analytics tools such as AWS Athena and NoSQL databases such as DynamoDB guarantee persistence. As for headless-browser technology, our main choice remains the excellent PhantomJS, although we are open to integrating the more recent headless Chromium.

The wrapper-induction algorithms are currently exposed through an internal API hosted on AWS Elastic Beanstalk. They will be progressively integrated into the acquisition framework, which is built on Spark and the Java/Scala languages. Visual support tools built on Node.js help less experienced users define their ETL pipelines.
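To give a flavour of what wrapper induction does, here is a minimal sketch in the classic LR-wrapper style: it learns the left and right delimiters that surround a target value in labeled example pages, then reuses them to extract values from new pages. This is an illustrative assumption about the technique in general, not WebRobot's internal algorithm.

```python
import re

def induce_lr_wrapper(examples):
    """Induce a simple left/right (LR) delimiter wrapper.

    examples: list of (html, target_value) pairs.
    Returns (left, right): the longest left context common to all
    examples and the longest common right context.
    """
    lefts, rights = [], []
    for html, value in examples:
        idx = html.index(value)
        lefts.append(html[:idx])
        rights.append(html[idx + len(value):])

    # Longest common suffix of the left contexts.
    left = lefts[0]
    for s in lefts[1:]:
        while not s.endswith(left):
            left = left[1:]

    # Longest common prefix of the right contexts.
    right = rights[0]
    for s in rights[1:]:
        while not s.startswith(right):
            right = right[:-1]

    return left, right

def apply_wrapper(html, left, right):
    """Extract every value enclosed by the induced delimiters."""
    pattern = re.escape(left) + r"(.*?)" + re.escape(right)
    return re.findall(pattern, html)
```

Given two product pages where the price sits inside the same markup, the induced delimiters generalize to unseen pages with that markup, which is the essence of learning an extraction rule instead of hand-writing it.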

Our ETL is developed using parser-generation technology (ANTLR) and will evolve continuously to converge towards our complete ETL vision.
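To illustrate the grammar-driven idea behind a parser-generated ETL, here is a toy hand-written recursive-descent parser for a tiny transform expression language. The grammar and the `trim`/`upper`/`lower` function names are illustrative assumptions for this sketch only; the actual WebRobot DSL and its ANTLR grammar are not shown here.

```python
import re

# Toy grammar (illustrative only):
#   expr : IDENT '(' expr ')' | FIELD
# e.g. "trim(upper(name))" reads record["name"], uppercases it, trims it.

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\(|\))")

FUNCS = {
    "trim": str.strip,   # hypothetical built-in transforms
    "upper": str.upper,
    "lower": str.lower,
}

def tokenize(src):
    """Split the source into identifiers and parentheses."""
    pos, tokens = 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"bad input at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(tokens):
    """Recursive-descent parse into a nested AST of calls and fields."""
    def expr(i):
        name = tokens[i]
        if i + 1 < len(tokens) and tokens[i + 1] == "(":
            inner, j = expr(i + 2)
            assert tokens[j] == ")", "expected ')'"
            return ("call", name, inner), j + 1
        return ("field", name), i + 1
    tree, end = expr(0)
    assert end == len(tokens), "trailing input"
    return tree

def evaluate(tree, record):
    """Apply the parsed transform to one extracted record."""
    if tree[0] == "field":
        return record[tree[1]]
    _, name, arg = tree
    return FUNCS[name](evaluate(arg, record))
```

A generator such as ANTLR produces the lexer and parser from a declarative grammar file instead of the hand-rolled code above, which is what makes it practical to grow the transform language towards a complete ETL.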

Future research into decentralized platforms such as DFINITY and iExec will open new horizons for deployment. We will remain open to the latest paradigms that DLT technologies bring about.
