Mid-Senior Data Scientist (GenAI + Machine Learning)

Company not specified • Lisbon

Full-time • Other

Job Description

**About the Role**

We are looking for a versatile Data Scientist to lead our AI initiatives, bridging the gap between classical Machine Learning and cutting-edge Generative AI. You will contribute to the development of StarkVision, our autonomous AI agent platform, and maintain our core predictive models for Churn, Segmentation, Forecasting and others. This is a high-impact role where you will not only build models but also architect complex multi-agent systems that interact directly with our users and databases.

**Key Responsibilities**

1. Generative AI & Autonomous Orchestration

* Agent Orchestration: Design and optimize multi-agent workflows. You will manage specialized agents (Database Admin, Report Agent, Cloud Specialist) to handle complex user requests.
* Text to SQL: Enhance our Text-to-SQL capabilities, ensuring accurate translation of natural language into complex SQL queries using LLMs (GPT-4, Gemini, Grok).
* RAG & Memory: Maintain and improve our long-term memory systems using Vector Databases (ChromaDB) and RAG pipelines to provide context-aware responses.
* Tool Integration: Develop and maintain MCP (Model Context Protocol) tools that allow agents to interact with external services (OneDrive, SQL Databases, PDF parsers).

2. Machine Learning & Predictive Modeling

* Churn Prediction: Maintain and improve our deep learning Churn Model built with TensorFlow/Keras (LSTMs). You will handle 3D time-series data construction and model optimization.
* Customer Segmentation: Refine our clustering pipelines using Scikit-Learn and Faiss (for GPU-accelerated clustering). Implement advanced feature selection techniques and hybrid clustering approaches (K-Means + Hierarchical).
* Forecasting: Manage time-series forecasting modules for sales and demand prediction.

3. Backend & Deployment

* Production Engineering: Deploy models and agents within our Flask application, utilizing Redis and RQ (Redis Queue) for asynchronous background processing.
* Data Engineering: Write efficient SQL queries and Python scripts (Pandas, SQLAlchemy) to preprocess large datasets for both ML training and agent consumption.

4. Other

* Company Workflows: Assist in requirements definition, making sure everyone is on the same page.

**Requirements**

* Bachelor's degree in Computer Science, Software Engineering, Computer Engineering, or a related technical field.
* 3+ years of experience in Data Science or Machine Learning Engineering.
* GenAI Expertise: Proven experience building LLM-based applications. Familiarity with agentic workflows and RAG architectures is a must.
* Deep Learning: Hands-on experience with Neural Networks, specifically LSTMs/RNNs for time-series data (TensorFlow or PyTorch).
* Strong Math/Stat Foundation: Deep understanding of clustering algorithms, dimensionality reduction (PCA), and statistical forecasting.
* Coding Skills: Expert-level Python. Comfortable writing production-ready code and unit tests, and working with APIs.
* Database Skills: Strong SQL proficiency. You understand database schemas and can optimize queries.

**Nice to Have**

* Experience with Kubernetes and Helm Charts.
* Knowledge of Cloud Services.
* Experience with Text-to-SQL fine-tuning or prompt engineering.
* Familiarity with Faiss for large-scale similarity search.
* Knowledge of Model Context Protocol (MCP).
* Knowledge of Agent2Agent Protocols (A2A).
* Ability to handle ambiguity in complex requirements scenarios.
* Experience with on-prem solutions.

**What We Offer**

* A great environment with real-world challenges.
* Opportunity to work on high-tech products integrating Analytics, AI and BI for Banking, Retail and Health.
* A modern technical environment (Modern Python, automated workflows).
* Hybrid work format.

Job type: Full-time

Benefits:

* Meal card/voucher

Work location: On-site
