Pepkor Corporate Centre

Cape Town
Posted 5 months ago


We are looking for a Senior Data Scientist who will help us define and solve real business problems and see solutions through to the end by applying state-of-the-art machine learning operations principles. Your primary focus will be to build and maintain data pipelines and machine learning solutions, and to produce bespoke analytics reports and dashboards, using our group data access, cloud computing ecosystem and skills. You will form part of the go-to team for solving problems in marketing, sales, operations, planning, logistics and supply chain.


  • Identify the right problems to solve by analysing the viability of business requests and ideas, in the predictive, prescriptive and cognitive analytics space
  • Support clients in the project definition stage by focusing on value creation
  • Conceptualise the right solution to generate maximum value
  • Find, clean and join the data necessary for the solution in a SQL + Python environment
  • Design and build the feature stores for model training 
  • Select and fit accurate and robust predictive models for each use case
  • Take the models to scale with the support of our ML Engineers
  • Design and build the metrics to monitor model stability and accuracy
  • Analyse models to make decisions on retraining frequency and retraining triggers
  • Model and solve optimisation and decision support problems inside the production pipeline environment (SQL, Python)
  • Design and build production-grade minimum viable products, such as dashboards and published notebooks for business users, linked to the feature stores, scoring engines and optimisation models that you’ve built!


  • PhD degree, or current enrolment as a final-year doctoral student, in a quantitative field: Statistics, Mathematics, Computer Science, Engineering, Operations Research. This is an absolute requirement.
  • 3+ years of experience in predictive modelling and in at least one of the following fields: mathematical programming, combinatorial optimisation or operations research
  • Good applied mathematical statistics skills, such as regression analysis and hypothesis testing
  • A good understanding of a wide selection of predictive modelling, machine learning, clustering and classification techniques and algorithms, such as linear and logistic regression, XGBoost, neural networks, k-means, decision trees and random forests, is a requirement
  • Experience in applying operations research techniques such as deterministic and stochastic optimisation models, metaheuristics, simulation modelling, dynamic programming, queueing theory and Markov chains would be advantageous
  • Fluency in a programming language (Python, SQL, R, C, Java)
  • A basic understanding of code repositories, git and CI/CD
  • Experience with common data science toolkits, such as R, SPSS, SAS, NumPy etc. 
  • Familiarity with Big Data frameworks and cloud platforms (Hadoop, Spark, GCP, AWS, Azure)
  • Experience with data visualisation tools, such as D3.js, ggplot2, Shiny, etc.
  • Competencies required: excellent problem assessment, decision-making, communication and interpersonal skills; a self-driven, results-driven team player who assumes ownership of the given task and is committed to excellence.

CLOSING DATE: 31 May 2022

If you are interested in the above position and meet the requirements as indicated above, please submit your CV via e-mail to:   

If you do not hear from us within 4 weeks of the closing date of this position, please regard your application as being unsuccessful.

Pepkor strives for equal opportunity in terms of its employment equity guidelines.

Job Features

Job Category: Data and Analytics
