Boss Software Solutions
// data science project briefs · 2026
A real business system serving retail operations in real time — generating continuous, connected events across sales, customers, inventory, workforce and communications. You'll research real-world decision-making as it happens, not a benchmark assembled for a paper.
Boss Software Solutions runs the ERP, CRM and POS systems behind retail chains across Israel — the live systems generating all of the data above.
The character of the data is the real value: live, connected, sequential, and relational.
Hundreds of businesses are operating right now. Models can be designed to run live, while a customer is shopping or mid-transaction, not only on yesterday's export.
Sales, CRM, inventory, workforce, tasks and communications joined in one environment. You can model a whole business, not a single isolated table.
The platform captures ongoing events, not isolated snapshots: purchases, inventory movements, employee actions, customer interactions and operational workflows.
Customers, products, categories, branches, employees, promotions and suppliers are richly interlinked — a real entity graph waiting to be explored.
Unlike most academic projects, you'll have access to a live production environment generating continuous operational events across sales, customers, inventory, workforce and business processes — enabling research on real-world decision-making, not static benchmark datasets.
Five solid starting points — with plenty of flexibility to define the scope together.
Predict purchase potential for products or categories, predict churn, and identify behavioral patterns over time — from long transaction history, profiles, loyalty clubs and promotions.
Hierarchical demand forecasting at the product × branch level from historical sales sequences. Applicable to inventory, ordering, supply and branch operations.
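As a rough illustration of what "hierarchical" means here, the sketch below forecasts each product × branch series with a plain moving average and then sums upward, so branch, product and total forecasts stay consistent with the leaf level (a bottom-up approach). The data, the names `moving_average_forecast` and `bottom_up_forecast`, and the choice of a moving average are all assumptions for illustration, not a description of the platform's data or of a recommended model.

```python
from collections import defaultdict


def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def bottom_up_forecast(sales, window=3):
    """Forecast every (product, branch) series, then aggregate upward.

    `sales` maps (product, branch) -> list of unit sales per period.
    Summing leaf forecasts guarantees the hierarchy adds up.
    """
    leaf = {key: moving_average_forecast(series, window)
            for key, series in sales.items()}
    by_branch = defaultdict(float)
    by_product = defaultdict(float)
    for (product, branch), value in leaf.items():
        by_branch[branch] += value
        by_product[product] += value
    total = sum(leaf.values())
    return leaf, dict(by_branch), dict(by_product), total


# Hypothetical toy data: three product x branch series, three periods each.
sales = {
    ("milk", "tel_aviv"): [10, 12, 14],
    ("milk", "haifa"): [5, 5, 8],
    ("bread", "tel_aviv"): [20, 22, 21],
}
leaf, by_branch, by_product, total = bottom_up_forecast(sales)
print(by_branch, by_product, total)
```

In a real project the moving average would be replaced by a learned model, and bottom-up is only one of several reconciliation strategies worth comparing.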
Detect data-entry errors, operational faults, abnormal usage and suspected fraud — both in the real-time transaction stream and in customer, employee or branch behavior over time.
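A simple baseline for the anomaly direction above is a robust z-score: measure how far each value sits from the median, scaled by the median absolute deviation (MAD), so that a single extreme value does not distort the scale. The sketch below assumes a flat list of transaction amounts; the sample values, the function names and the 3.5 threshold are illustrative assumptions.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value by its distance from the median, scaled by the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case (at least half the values are identical):
        # this sketch simply reports no deviation.
        return [0.0 for _ in values]
    # 0.6745 makes the score comparable to a standard z-score under normality.
    return [0.6745 * (v - med) / mad for v in values]


def flag_anomalies(values, threshold=3.5):
    """Return the indices whose robust z-score exceeds the threshold."""
    scores = robust_z_scores(values)
    return [i for i, z in enumerate(scores) if abs(z) > threshold]


# Hypothetical transaction amounts with one likely data-entry error.
amounts = [42.0, 39.5, 41.0, 40.0, 43.5, 38.0, 900.0, 41.5]
flagged = flag_anomalies(amounts)
print(flagged)
```

A per-value rule like this ignores context such as branch, cashier or time of day, which is exactly where the connected data would let a project go further.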
Predict the next item or basket from prior purchase sequences. Meaningful value in recommendations, loyalty clubs and targeted campaigns.
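One minimal way to frame next-item prediction is a first-order Markov baseline: count how often each item follows another in past sequences, then recommend the most frequent successors. The sequences and function names below are hypothetical, and a sequence model trained on real purchase histories would be expected to beat this baseline.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count item -> next-item transitions across all purchase sequences."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for prev_item, next_item in zip(seq, seq[1:]):
            transitions[prev_item][next_item] += 1
    return transitions


def predict_next(transitions, last_item, k=2):
    """Return up to k most frequent successors of `last_item` (empty if unseen)."""
    successors = transitions.get(last_item, Counter())
    return [item for item, _ in successors.most_common(k)]


# Hypothetical purchase sequences, one list per customer.
sequences = [
    ["coffee", "milk", "sugar"],
    ["coffee", "milk", "cookies"],
    ["coffee", "filters"],
    ["tea", "milk", "sugar"],
]
transitions = fit_transitions(sequences)
print(predict_next(transitions, "coffee", k=1))
print(predict_next(transitions, "milk", k=2))
```

Because prediction is a dictionary lookup, a baseline of this kind is cheap enough to run mid-transaction, which makes it a useful reference point for heavier models.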
Estimate how demand responds to price and promotion changes across products and categories — from real pricing, discount and sales history. A step toward causal, decision-oriented modeling.
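A common first estimate of price elasticity is the slope of a log-log regression of quantity on price: a slope of -2 means a 1% price increase goes with roughly a 2% drop in units sold. The sketch below fits that slope by ordinary least squares on synthetic data generated with a known elasticity. The function name and the data are assumptions, and on real sales history this slope is only a correlation, since promotions, seasonality and stock-outs confound it.

```python
import math


def log_log_elasticity(prices, quantities):
    """Estimate elasticity as the OLS slope of log(quantity) on log(price)."""
    xs = [math.log(p) for p in prices]
    ys = [math.log(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Synthetic data: quantity = 1000 * price ** -2, so the true elasticity is -2.
prices = [1.0, 2.0, 4.0, 5.0]
quantities = [1000.0 * p ** -2 for p in prices]
elasticity = log_log_elasticity(prices, quantities)
print(round(elasticity, 6))
```

Moving from this slope toward a causal estimate is where the real pricing and promotion history becomes valuable, for example by exploiting price changes that were not driven by demand.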
The breadth also enables combined projects across sources: demand forecasting fusing sales, inventory and promotions; anomaly detection fusing transactions, employees and logs; customer-behavior models fusing purchases, campaigns, service and loyalty. Deep learning over a whole business system, not a single source.
Because the system is live, several of these can be posed as real-time inference — "can the model run while a customer is shopping?" Think next-basket prediction, fraud detection, dynamic recommendations and live anomaly alerts.
Less conventional directions, made possible specifically by the connected, relational nature of this data.
Customers, products, categories, branches, employees, promotions and suppliers form a rich relational network. Explore link prediction, fraud rings, substitution & affinity structures, and GNN-based recommendation.
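Before reaching for a GNN, affinity and link-prediction ideas can be tried with a neighborhood-overlap heuristic: two products that share many buyers are likely to be related. The sketch below scores product pairs by the Jaccard similarity of their buyer sets. The product names, customer IDs and function names are hypothetical, and this is a classical baseline rather than the graph-learning approach itself.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_product_affinity(buyers):
    """Score every product pair by the overlap of their buyer sets.

    `buyers` maps product -> set of customer IDs who purchased it.
    Returns ((product_a, product_b), score) pairs, highest score first.
    """
    scores = [((a, b), jaccard(buyers[a], buyers[b]))
              for a, b in combinations(sorted(buyers), 2)]
    return sorted(scores, key=lambda pair_score: pair_score[1], reverse=True)


# Hypothetical bipartite purchase graph: product -> customers.
buyers = {
    "pasta": {"c1", "c2", "c3"},
    "tomato_sauce": {"c1", "c2", "c4"},
    "shampoo": {"c5"},
}
ranked = rank_product_affinity(buyers)
print(ranked[0])
```

The same scoring applies to other edge types in the entity graph, such as customers linked through shared branches or promotions, and gives a reference point for judging whether a learned graph model adds value.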
Combine transactions, call metadata, support tickets, tasks and operational events to predict customer outcomes or surface operational issues that no single source reveals on its own.
LLM-based reasoning over live operational data: natural-language access to ERP/CRM information and autonomous recommendations for business workflows. Ambitious and exploratory.
A broad operational environment from real retail businesses — not just POS data.
Item-level transactions within a basket: item, quantity, price, discount, time, branch, register, cashier, promotions.
Product hierarchies, categories, brands, prices and promotions — and how they change over time.
Stock, supplier orders, receipts, inter-branch transfers, shortages and inventory movements.
Purchase history, customer segments, loyalty traits, campaigns, benefits and return patterns.
Tasks, workflows, statuses, assignment to employees and branches, handling and completion times.
Working hours, attendance, branch assignment and operational performance over time.
Call data: times, durations, statuses, and association to branch, customer or process.
User actions, exceptional events, system processes and module usage data.
Raw, curated or ML-ready datasets — whichever fits the project. We've handled customer consent and data access for analytics before, so real data won't be a blocker.
Access to servers and dedicated GPU time, so you can train and run experiments at real scale — not only on a laptop.
Relevant API access, documentation, and hands-on guidance from me and the dev team — to help you understand the data, frame the problem, and unblock issues together.
We can hold a short intro session covering the system, the available data and the options — to help you choose a research direction.