Company-wise Questions

Real interview questions from 25+ companies

Google

BigQuery, Dataflow, Beam, Pub/Sub, Bigtable, GCP pipelines

Amazon

S3, Glue, Athena, Kinesis, Redshift, EMR, Lake Formation

Microsoft

Azure ADF, Synapse, Databricks, Spark, managed identity, DR

Apple

Spark vs MapReduce, Parquet, SQL, binary trees, ETL optimization

Meta

Product metrics, SQL, data modeling, A/B testing, streaming pipelines

Netflix

Streaming schema, recommendations, SQL, Spark, reservoir sampling

NVIDIA

GPU telemetry, Kafka, CDC, data quality, governance

Tesla

Vehicle telemetry, SQL, algorithms, shortest path, manufacturing QA

Walmart

Retail DWH, POS streaming, SCD, compliance, Spark at scale

Atlassian

Event pipelines, SQL attribution, graph modeling, Jira analytics

JP Morgan Chase

Spark, SQL, distributed systems, system design

Goldman Sachs

SQL, Java, algorithms, data modeling, Spark

HSBC

Real-time pipelines, GDPR, Spark, Azure, SQL

American Express

SQL, Python, Spark, data modeling

Flipkart

E-commerce data modeling, Spark, SQL, pipelines

Deloitte

Azure, ADF, Databricks, Delta Lake, PySpark

PwC

Azure ADF, Delta Lake, Spark, SQL, CI/CD

KPMG

Databricks, ADF, ADLS, SQL, PySpark, Python

EY

SQL, PySpark, cloud, data engineering

Infosys

Azure, Databricks, ADF, Delta Lake, SQL

TCS

GCP, Azure, Python, Spark, SQL

Wipro

Catalyst optimizer, AQE, skew joins, Data Vault, CDC vs full refresh

Cognizant

PySpark, ADLS, Azure SQL, Spark, SQL

Capgemini

Delta Lake, SCD, Data Mesh, Parquet, Lambda vs Kappa, cost optimization

EPAM

Spark internals, Delta Lake, GCP, SQL

Publicis Sapient

System design, SQL, clickstream analytics

Persistent

PySpark, ADF, Synapse, SQL, Azure DevOps

Virtusa

Lakehouse, CDC, schema drift, window functions, columnar storage

Coforge

ETL vs ELT, surrogate keys, PolyBase, Medallion, event-driven ingestion

Amdocs

Telecom CDR pipelines, CDC in Azure, Delta Lake, ADF error handling

Accenture

Cloud pipelines, SCD types, DLT, Unity Catalog, CI/CD for data

Tata Digital

SQL, PySpark, GCP, BigQuery, Airflow

Tredence

Spark, Airflow, SQL, Python, system design

Fractal

GCP, Azure, PySpark, Airflow, Databricks

Quantiphi

GCP, BigQuery, Spark, ML pipelines

Nielsen

GCP, Spark, SQL, Python, Airflow

Latentview

ADF, ADLS, Synapse, Azure DevOps, Python

EXL

SparkSession, ADLS, Stream Analytics, SQL

Synechron

Spark execution, Airflow, Hive, SQL

Datametica

PySpark, SQL, nested JSON, ETL

Saama Technology

PySpark, Spark optimization, SQL

HCL

PySpark coding, deduplication, aggregations

NTT DATA

SQL, Spark, Hadoop, Python, ETL fundamentals

AppZen

SQL: self-joins, running totals, deduplication

66 Degree

Python, Spark, BigQuery, GCP

AP Moller - Maersk

PySpark, SQL, data pipelines

Deutsche Bank

SQL, Spark, Python, data modeling
