424 Companies

Find technology partners quickly.

Spotfolio tracks over one million companies in technology industries.

There are 424 companies in the field of Big Data on spotfolio that produce or deliver products, utilize or research technologies, or are otherwise engaged in topics such as Big-Data, Data, Machine-Learning, Analytics, Apache, Cloud, Hadoop, Big, and Apache-Spark.

Start-ups (48)

Early-stage companies (186)

Established companies (172)

Most of these companies are located in the following countries:

United States (235)

India (31)

United Kingdom (26)

Germany (14)

Example Companies

Data Mechanics

start-up

Access

  • Technologies
  • Core data such as location, employees, revenues
  • Similar and peer companies
  • and more

Web Quotes

over Kubernetes or YARN, with a commercial service or using open-source Apache Spark. This is our first step towards building Data Mechanics Delight - the ...

/blog-post/how-to-be-successful-with-apache-spark-in-2021 How to be successful with Apache Spark in 2021 Apache Spark is the leading technology for data ...

YARN. In this article, we will illustrate the benefits of Docker for Apache Spark by going through the end-to-end development cycle used by many of our ...

+ AI Summit, we went over the best practices and pitfalls of running Apache Spark on Kubernetes. We’d like to expand on that and give you a comprehensive ...

costs and performance over time. Tuesday, September 8, 2020 /blog-post/apache-spark-performance-benchmarks-show-kubernetes-has-caught-up-with-yarn Apache ...

Spark Performance Benchmarks show Kubernetes has caught up with YARN Apache Spark on Kubernetes is as performant as Spark on YARN, including during shuffle

Apache Spark - the leading analytics engine for big data processing Over 1,000 contributors from 250 orgs Spark is the most popular open-source distributed ...

easier and versatile. Why choose spark? The advantages of working with Apache Spark Whether you’re working on ETL & Data Engineering jobs, machine learning ...

This makes it easy for developers from most backgrounds to easily adopt Apache Spark. 6. Easy to use In a few lines of code, data scientists and engineers ...

Mechanics bring to the table? Data Mechanics actively optimizes your Apache Spark workloads in your cloud account (AWS, Azure, or GCP) on a fully managed ...

/blog-post/how-to-be-successful-with-apache-spark-in-2021 How to be successful with Apache Spark in 2021 Monday, November 2, 2020 Apache Spark is the leading technology ...

Ascend.io Ascension Labs

early-stage

Web Quotes

Integrations - Overview - Data Access Methods - Structured Data Lake - Apache Spark - Jupyter - PowerBI - Tableau - Zeppelin Administration - SSO ...

Developers - Quickstart - Introduction - Authentication Powered by Apache Spark Reading data into Databricks Spark using Structured Data Lake Suggest ...

Device_and_Weather_Analysis/K_Means_Cluster") Updated 3 months ago Apache Spark Reading data into Databricks Spark using Structured Data Lake Suggested

expr is true. Argument type Return type ( Bool ) Bool More ANY on Apache Spark Documentation SOME #some Description some(expr) - Returns true if ...

expr is true. Argument type Return type ( Bool ) Bool More SOME on Apache Spark Documentation BOOL_OR #bool_or Description bool_or(expr) - Returns ...

expr is true. Argument type Return type ( Bool ) Bool More BOOL_OR on Apache Spark Documentation BOOL_AND #bool_and Description bool_and(expr) - Returns ...

are true. Argument type Return type ( Bool ) Bool More BOOL_AND on Apache Spark Documentation EVERY #every Description every(expr) - Returns true

Cask Cask Data, Inc.

established

Web Quotes

the cloud with Dataproc, Google’s managed service for Apache Hadoop & Apache Spark. I’ll also show you how to deploy the pipeline to Data Fusion, the managed ...

set the execution engine to Spark so that the pipeline will later use Apache Spark for data processing when it runs on Dataproc. Deploy to Cloud Data Fusion ...

Dataproc profile. Dataproc will create an ephemeral cluster and use the Apache Spark processing engine to process the data. Conclusion Building data integration

to simplify testing; Support for cutting edge Cloud, Apache Hadoop and Apache Spark technologies. Hide Enterprise ready Metadata repository with automatic

Access

  • Technologies
  • Core data such as location, employees, revenues
  • Similar and peer companies
  • and more

Web Quotes

Association Studies with Apache Spark™, Delta Lake, and MLflow: To perform population-scale Genome-Wide Association Studies with Apache Spark, the underlying storage ...

ACID transactions is Delta Lake. - Scaling Bioinformatics Methods with Apache Spark™: Parallelizing SAIGE Across Hundreds of Cores: To parallelize Genome-Wide ...

cores, the new Pipe Transformer tool integrates command-line tools with Apache Spark and Delta Lake. - Monitor Medical Device Data with Machine Learning using ...

Lake - Delta Architecture, a step beyond Lambda Architecture - Making Apache Spark™ Better with Delta Lake - Getting Data Ready for Data Science - Delta ...

for Data Lakes - Simplifying Streaming Analytics using Delta Lake and Apache Spark™ Spark+AI Summit Delta Lake Sessions At the Spark+AI Summit EU 2019 ...

Building Data Pipelines for Apache Spark™ with Delta Lake Training Session - New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, and

asked questions (FAQ) - Releases - Releases notes - Compatibility with Apache Spark - Resources Updated Aug 31, 2020 Contribute - Documentation - Releases ...

Compatibility with Apache Spark Permalink to this headline The following table lists Delta Lake versions and their compatible Apache Spark versions. Delta ...

Delta Lake version: 0.7.0 and above; Apache Spark version: 3.0.0 and above
Delta Lake version: below 0.7.0; Apache Spark version: 2.4.2 - 2.4.<latest>

Opcito

early-stage

Web Quotes

Resources - Blogs - Case Studies - Whitepapers - Newsletters Tag: Apache Spark Data ingestion with Hadoop Yarn, Spark, and Kafka June 7, 2018 ♥78

Use detailed technology terms.

Only spotfolio lets you search with detailed technology terms, because we provide all the topics that technology companies talk about.

Use Cases

Why Technology Scouting?

Technology scouting is the basis for your innovation management: it identifies new technological developments as early as possible and protects you from disruptive business models.

Identify technology partners

Boost your market research

Look beyond major players

Build long and short lists

Collect technology leaders, providers, start-ups, and potential new partners in lists via simple tagging so you can take further steps with your team.