Solving Data Management Challenges in Third-Party Logistics (3PL) Spaces – with Vladimir Gofaizen of Wineshipping

Riya Pahuja

Riya covers B2B applications of machine learning for Emerj across North America and the EU. She previously worked with the Times of India Group as a journalist covering data analytics and AI. She resides in Toronto.


This interview analysis is sponsored by NLP Logix and was written, edited, and published in alignment with our Emerj sponsored content guidelines. Learn more about our thought leadership and content creation services on our Emerj Media Services page.

Logistics and adjacent industries like manufacturing are not traditionally thought of as spaces conducive to fast technological adoption. According to reporting from Spain-based logistics research firm Unologistica, 23% of companies in the logistics and transportation sector are utilizing Big Data to optimize their operations. In contrast, just 9.6% have integrated artificial intelligence (AI) into their workflows, largely because many believe the technology has yet to mature enough for widespread adoption.

A 2022 McKinsey survey indicates that the most significant cost savings from AI are seen in supply chain management, with substantial benefits in supply chain planning areas such as production, inventory management, and product distribution.

Additionally, companies can harness AI-driven tools to analyze large volumes of real-time data, enhancing the precision of demand forecasts, according to sources cited by Georgetown University.

These efficiencies are especially felt in third-party logistics spaces, where actors may have conflicting incentives. However, a study from Penn State University, Penske Logistics, and NTT DATA from earlier this year reported that, despite ongoing challenges like geopolitical unrest causing delays, most shippers (89%) and 3PLs (94%) report successful partnerships.

Furthermore, the Penn State-led study notes that shippers benefit from improved customer service (82%) and some logistics innovations (68%), with a notable increase in outsourcing, as 87% of shippers have expanded their use of 3PLs. The role of technology in supply chain innovation is crucial, with shipper satisfaction with 3PL IT capabilities rising to 87% and top priorities including control tower visibility and execution-based technologies like transportation and warehouse management systems.

Emerj Senior Editor Matthew DeMello recently sat down with Vladimir Gofaizen, Director of IT Engineering at Wineshipping, to talk about how his organization streamlined data access, improved efficiency, and reduced costs by centralizing data, optimizing processes, and enabling AI-driven tools for more accessible data interaction.

Wineshipping is a leading logistics provider specializing in the storage, transport, and fulfillment of wine and alcoholic beverages in the United States.

From their conversation, this article articulates the following two key insights for logistics and data leaders:

  • Implementing a flexible, scalable data infrastructure: Adopting a flexible data platform that separates compute from storage to enable rapid scaling, real-time insights, and stakeholder visibility across all systems, reducing reporting costs and boosting decision-making efficiency.
  • Enabling self-service AI to increase efficiency and reduce costs: Using AI tools to automate query correction and streamline dashboard creation for non-technical stakeholders while optimizing data processing to reduce redundant compute usage, enabling faster access to accurate data and lowering operational costs.

Guest: Vladimir Gofaizen, Director of IT Engineering, Wineshipping

Expertise: Enterprise Application Development, Software Architecture, Application Security

Brief Recognition: Vladimir Gofaizen is the Director of IT Engineering at Wineshipping, where he leads cross-functional teams with 20+ contributors in software engineering, data engineering, and DevOps areas to scale operations. He has 20 years of experience in enterprise application development and 10 years of experience leading software projects. He earned his master's in Business Analytics from Georgia State University.

Implementing a Flexible, Scalable Data Infrastructure

Vladimir opens the conversation by talking about the significant challenges his company faced due to fragmented data systems in Wineshipping’s operations. Over the years, they accumulated five separate data warehouses created by different vendors and executives with varying approaches. These data warehouses collected information from various systems—warehouse management, order management, ERP, HR, and others—often due to acquisitions or leadership changes. 

That framework led to data silos where data was not integrated, causing confusion and inefficiency. With various system deployments, stakeholders ended up viewing different data sets, complicating reporting and increasing costs, as users expected centralized data access but found data scattered across systems. Consequently, they were not able to seize critical opportunities to leverage cross-system data for both customers and internal users:

“Well, for us, combining it into a single data platform was just the foundation. The longer term strategic objective was to enable aggregating data from more systems and gluing it together quicker.

For a purpose like that, if we just hired a data engineering team working in a specific data platform that is not collaborative, that is more like a standard, legacy data warehouse. It would not be very sustainable. We also needed to democratize the data and enable our stakeholders to participate and provide visibility into the data engineering process earlier, so all of these capabilities don’t really exist in the legacy data warehouse systems.”

–Vladimir Gofaizen, Director of IT Engineering at Wineshipping

He then outlines multiple objectives in his data strategy:

  • Lift-and-Shift Consolidation:
    Initially, they centralized data from across the enterprise, not only from data warehouses but also from spreadsheets, documents, and SaaS systems scattered across the organization.

    Prior to this, executives and departments had limited visibility, with each department relying on isolated systems, such as those for carrier coordination and seller coordination, whose data needed to be shared. The “lift and shift” moved all this data to one unified platform.
  • Expanded Data Integration:
    Next, they incorporated more essential systems into the platform, adding previously inaccessible data sources for a more comprehensive data foundation.
  • Analytics and Dashboarding:
    They aimed to create scalable analytics and dashboards that could serve different needs—whether for internal teams, customer-facing APIs, or even advanced applications like AI. This flexibility allowed them to tailor data access as needed for various stakeholders.
  • Direct System Integration:
    Rather than using an integration platform, they leveraged their data platform (Databricks) to build direct data feeds to other systems. For example, they created an integration with AfterShip, which tracks shipment statuses across carriers.

    By using their data platform, they connected tracking data from over 90 carriers, allowing updates to flow directly, quickly, and reliably. This integration served both analytics and real-time tracking, ensuring that data always reached the right system and destination efficiently.
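
As a rough illustration of this kind of direct feed, the sketch below reads newly updated tracking events from a table on the data platform and posts them to an external tracking service over REST. It is a minimal sketch under stated assumptions: the table, column names, and endpoint URL are hypothetical placeholders, not Wineshipping's actual schema or AfterShip's API.

```python
# Minimal sketch of a direct data feed from a Databricks data platform
# to an external tracking service. Table, columns, and endpoint are
# hypothetical placeholders, not Wineshipping's schema or AfterShip's API.
import requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read tracking events that have not yet been pushed downstream.
pending = (
    spark.read.table("logistics.carrier_tracking_events")  # hypothetical table
    .filter("pushed_to_tracker = false")
    .select("tracking_number", "carrier_code", "status", "event_time")
)

# Post each new event to the downstream system (placeholder endpoint).
for row in pending.toLocalIterator():
    payload = {
        "tracking_number": row["tracking_number"],
        "carrier": row["carrier_code"],
        "status": row["status"],
        "event_time": str(row["event_time"]),
    }
    resp = requests.post(
        "https://tracking.example.com/api/v1/events",  # hypothetical endpoint
        json=payload,
        timeout=10,
    )
    resp.raise_for_status()
```

In practice, a feed like this would be scheduled as a job on the platform itself, which is what lets the same tables serve analytics, customer-facing APIs, and near-real-time integrations without a separate middleware layer.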

Vladimir further explains the engineering factors behind their choice of a modern data platform, highlighting two key considerations: scalability and stability.

  • Scalability:
    Modern data systems allow them to scale compute resources (CPU and RAM) separately from storage, providing flexibility that legacy systems lack. For instance, they can create multiple temporary clusters to ingest vast amounts of data (several terabytes) in minutes by briefly increasing spending—an approach that legacy systems couldn’t support.

    They can also dedicate different clusters for distinct purposes: one for data ingestion and another for serving data to APIs, ensuring that scaling one doesn’t impact the other. Additionally, Databricks’ SQL clusters have auto-scaling, adjusting resources as needed without interfering with data ingestion. (A rough sketch of this separation of compute from storage follows the list below.)
  • Stability:
    This framework ensures a stable environment where ingesting or processing large data volumes does not disrupt other activities. Together, scalability and stability made this modern platform a solid choice for their needs.
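
To make the separation of compute from storage concrete, the sketch below writes ingested data to a Delta table backed by cloud object storage from one cluster, then queries the same table from a separate cluster or SQL warehouse. Because the data lives in storage rather than on either cluster, each side can be scaled up, scaled down, or terminated without affecting the other. The storage path and table names are illustrative assumptions, not Wineshipping's.

```python
# Illustrative only: the storage path and table names are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# --- Ingestion cluster: a temporary cluster loads raw files and writes
# --- them to a Delta table that lives in cloud object storage.
raw = spark.read.json("abfss://landing@storageacct.dfs.core.windows.net/orders/")
(
    raw.write.format("delta")
    .mode("append")
    .saveAsTable("bronze.orders")  # data persists in storage, not on the cluster
)
# The ingestion cluster can now be terminated; the data remains available.

# --- Serving side: a separate, auto-scaling cluster or SQL warehouse
# --- queries the same table for dashboards or APIs, independent of ingestion.
recent = spark.sql("""
    SELECT order_id, status, updated_at
    FROM bronze.orders
    WHERE updated_at >= date_sub(current_date(), 7)
""")
recent.show()
```

Running ingestion and serving on separate compute that shares the same storage is what keeps a multi-terabyte backfill from degrading dashboards or API responses.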

Enabling Self-Service AI to Increase Efficiency and Reduce Costs

Initially, Wineshipping set out to deliver a modern data platform and migrated the five data warehouses to Databricks. This transition provided Wineshipping with a unified view of their data and enabled users to access enterprise data easily. The transition was a lift and shift; however, additional careful planning and design were needed to increase efficiency and reduce costs.

Vladimir also describes their plan to replace traditional Power BI analytics dashboards with AI-generated dashboards to improve efficiency and accelerate output. While AI is commonly used for tasks like predictions and support systems, they are using it to speed up dashboard creation and collaborative engineering.

Their approach leverages AI within Databricks to create context-aware dashboards that pull in data from multiple systems. Instead of stakeholders approaching the data engineering team with raw ideas, they can ask AI to create prototypes using simple language. For example, they might request a dashboard that includes sales data, order data, and a forecast, which the AI can quickly generate. This allows stakeholders to bring a working prototype to the data engineering team for production, cutting down on back-and-forth discussions and initial requirement gathering.

Over the next year, they aim to use AI more heavily to produce data engineering assets like dashboards, moving towards an AI-driven approach for faster and more flexible analytics development.
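
The general pattern Vladimir describes can be sketched roughly as follows: a stakeholder's plain-language request, together with the schema context the platform already holds, is passed to a language model that drafts a candidate query for a prototype dashboard. Wineshipping uses the AI tooling built into Databricks rather than a hand-rolled script, so the endpoint, helper name, and schema below are hypothetical and only illustrate the general flow.

```python
# Rough sketch of natural-language-to-dashboard prototyping.
# The endpoint, schema, and prompt format are hypothetical placeholders.
import requests

SCHEMA_CONTEXT = """
Tables available:
  sales(order_id, customer_id, sale_date, amount)
  orders(order_id, status, ship_date, carrier_code)
  forecast(week_start, predicted_units)
"""

def draft_dashboard_query(request_text: str) -> str:
    """Ask a model-serving endpoint to draft SQL for a prototype dashboard."""
    prompt = (
        "You are a SQL assistant. Using only the tables below, write one "
        f"query answering the request.\n{SCHEMA_CONTEXT}\nRequest: {request_text}"
    )
    resp = requests.post(
        "https://models.example.com/v1/generate",  # hypothetical endpoint
        json={"prompt": prompt, "max_tokens": 400},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["text"]  # candidate SQL for the data team to review

# A stakeholder request in plain language:
candidate_sql = draft_dashboard_query(
    "Show weekly sales and open orders next to the demand forecast."
)
print(candidate_sql)
```

The value of such a prototype is not production-ready SQL but a concrete starting artifact the data engineering team can review and harden, which is what shortens the requirement-gathering loop described above.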

“So we are also using AI for production, troubleshooting, and support. I’m not sure how many companies do this. We found in Databricks that there is a set of tooling that fixes your queries. Since all of the data from all systems are in one place, it fixes cross-system queries.

So you would type in something that you might want to get from some tables with a bunch of typos in it and click the button that says ‘diagnosis.’ Then, it will fix your query, which enables semi-technical people to become very efficient at work, which was previously only possible for data engineers. 

In our company, a lot of stakeholders are starting to pick it up. It is surprising. Once the stuff becomes easy, they don’t need to worry about minor typos or understanding the complexities of SQL. Even people that are not really that technical are able to accelerate because AI just fixes it for you, and they can get to the data they need for live troubleshooting, including emergency situations or urgent requests for data from our customers.”

–Vladimir Gofaizen, Director of IT Engineering at Wineshipping
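
As an illustration of the kind of correction he describes, the snippet below shows a cross-system query with a misspelled keyword and column name alongside the corrected version a query-fixing assistant might return. The table and column names are invented for the example, and the "fixed" query here is written by hand to show the pattern, not actual Databricks output.

```python
# Illustrative before/after of an assistant-style query fix.
# Table and column names are invented; the corrected query is hand-written
# to show the kind of fix described, not actual Databricks output.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# What a semi-technical user might type: misspelled keyword and column.
broken_query = """
SELCT o.order_id, o.status, s.shipmnt_date
FROM oms.orders o
JOIN wms.shipments s ON o.order_id = s.order_id
WHERE o.status = 'DELAYED'
"""

# The corrected cross-system query after the fix.
fixed_query = """
SELECT o.order_id, o.status, s.shipment_date
FROM oms.orders o
JOIN wms.shipments s ON o.order_id = s.order_id
WHERE o.status = 'DELAYED'
"""

delayed = spark.sql(fixed_query)
delayed.show()
```

Because tables from every system sit in the same platform, the corrected query can join order-management and warehouse-management data in one statement, which is what makes the self-service troubleshooting Vladimir describes possible.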

Towards the end, he explains the importance of partnering with external experts, especially as companies face growing pains with large data platforms like Databricks or Snowflake. For Wineshipping, managing 15 terabytes of data that must be accessible across multiple systems in near real-time posed several challenges, including performance, disaster recovery (DR), and best practices. 

Internally, it would have been nearly impossible to tackle these issues alone, so they partnered with NLP Logix. Over eight weeks, NLP Logix conducted a comprehensive assessment of Wineshipping's Azure Databricks environment, identifying inefficiencies and opportunities to optimize performance and cost, and delivered a detailed report and executive overview outlining how to reduce costs, improve performance, and create repeatable patterns for the future.

NLP Logix’s critical findings for Wineshipping included:

  • Redundant data processing: All data was processed every day instead of just the new data, leading to unnecessary compute usage. NLP Logix recommended processing only the latest data and integrating it with the existing dataset (a minimal sketch of this incremental pattern follows this list).
  • Need for an Enterprise Data Model (EDM): to simplify a data architecture that had grown beyond the basic medallion architecture.
  • Business continuity: NLP Logix identified that a Terraform solution could be implemented to provide disaster recovery for the data platform.
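
A minimal sketch of the incremental pattern recommended in the first finding is shown below: only records newer than the latest timestamp already loaded are read and merged into the existing Delta table, rather than reprocessing the full history every day. The table names, business key, and watermark column are placeholder assumptions, not Wineshipping's actual pipeline.

```python
# Sketch of incremental processing with a Delta Lake MERGE.
# Table names, keys, and the watermark column are placeholder assumptions.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

target = DeltaTable.forName(spark, "silver.orders")

# Find the latest timestamp already present in the target table.
# (If the target is empty, this is None and a one-time full load applies.)
last_loaded = (
    spark.table("silver.orders")
    .agg(F.max("updated_at").alias("watermark"))
    .collect()[0]["watermark"]
)

# Read only source rows newer than the watermark instead of the full history.
new_rows = spark.table("bronze.orders").filter(F.col("updated_at") > F.lit(last_loaded))

# Merge the new rows into the existing dataset (upsert on the business key).
(
    target.alias("t")
    .merge(new_rows.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```

Processing only the new records each run is what removes the redundant daily compute spend the assessment flagged.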

Wineshipping has plans to continue expanding their use of Azure Databricks, including potential future AI/ML use cases within the environment. Utilizing Azure Databricks for these AI/ML use cases will ensure the needed data is readily available and scales with production processing volumes.
