Big Data (Python/Scala) Engineer - Assistant Vice President

Citi

Tampa, Florida, United States

Job Summary

  • Our technology solutions are the foundation of everything we do, from keeping the bank safe, managing global resources, and providing the technical tools our workers need to be successful, to designing our digital architecture and ensuring our platforms provide a first-class customer experience.
  • We reimagine client and partner experiences to deliver excellence through secure, reliable, and efficient services.
  • Citi offers competitive employee benefits, including: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs.

Salary

Base: $96,960.00 - $145,440.00; Bonus/Equity: discretionary and formulaic incentive and retention awards; Benefits: medical, dental & vision coverage; 401(k); life, accident, and disability insurance; and wellness programs.

Skills & Requirements

Must-have

  • Spark Core, Spark SQL, Spark Streaming
  • Expert-level Scala programming
  • Spark performance optimization
  • Deploy Spark on YARN, Mesos, Kubernetes
  • Advanced HiveQL queries
  • Hive metastore, execution engines
  • Object-oriented and functional programming
  • Scala build tools (SBT, Maven)
  • PySpark for data processing
  • Data manipulation libraries (Pandas, NumPy)
  • Complex query writing, subqueries, window functions
  • NoSQL databases (HBase, Cassandra, or MongoDB)
  • RDBMS concepts and SQL
  • Dimensional modeling, star/snowflake schemas
  • Data Ingestion Tools (Sqoop, Flume, Kafka)
  • Workflow Orchestration (Oozie, Airflow)
  • Cloud Platforms (AWS, Azure, GCP)
  • Version Control: Git
  • CI/CD tools (Jenkins, GitLab CI)
  • Monitoring and Logging (ELK, Grafana, Prometheus)
  • Agile/Scrum methodologies
  • Shell Scripting

Nice-to-have

  • Spark GraphX
  • Commitment to systemically responsible banking
  • Focus on a first-class customer experience
  • Comfortable bringing your authentic self to work
  • Passionate problem solver

Key Requirements

  • 5-8 years of relevant experience
  • Bachelor’s degree/University degree or equivalent experience
  • HDFS architecture, data storage, fault tolerance
  • YARN resource management
  • MapReduce programming paradigm
  • Zookeeper for distributed coordination
  • Common Scala libraries and frameworks
  • Data orchestration scripting
  • Query performance tuning
  • RDBMS concepts and SQL
  • Dimensional modeling concepts
  • Cloud experience (AWS: EMR, S3, Glue, Lambda; Azure: HDInsight, Data Lake, Databricks; GCP: Dataproc, BigQuery)
  • CI/CD experience
  • Agile Development familiarity

Work Rights

Not specified
