"I would say 80% of our focus is on data collection and organization, and that's really where Snowflake has helped us tremendously."
Singapore-based Graas, short for 'Growth as a Service,' is an eCommerce solution provider that uses AI and data analytics to help brands grow their online businesses profitably.
"We collect data from multiple sources for eCommerce brands and sellers, aggregate it in one place, and run intelligence on top of it. The goal is to support decision-making at these brands," said Mangesh Panditrao, chief AI officer at Graas, in an exclusive interview with AIM.
With generative AI emerging as a disruptive technology, Graas has also integrated it into various operations. Panditrao explained that they use different regression models for forecasting sales, sales comparisons, and identifying anomalies.
Graas's sales forecasting model differentiates itself from competitors by utilising time series data and taking into account external factors that influence sales. "It is very different from typical time series forecasting because it involves knowing which exogenous factors to use. In our case, we don't just examine sales data, we also consider ad spend, inventory prices, and discounts to build a model that can accurately predict sales," explained Panditrao.
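To make the idea of exogenous factors concrete, here is a minimal sketch of a forecast that regresses sales on the previous period's sales plus external drivers such as ad spend, inventory price, and discount. It is not Graas's model, which the article does not describe in detail. It assumes ordinary least squares with a single lag, and every function and variable name is hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b using Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations apply to both sides.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_forecaster(sales: Sequence[float], exog: Sequence[Sequence[float]]) -> List[float]:
    """Fit sales[t] ~ b0 + b1 * sales[t-1] + sum_k c_k * exog[t][k] by least squares.

    exog[t] holds the external factors for period t (assumed here to be
    ad spend, inventory price, and discount). Returns [b0, b1, c_1, ..., c_k].
    """
    rows = [[1.0, sales[t - 1], *exog[t]] for t in range(1, len(sales))]
    targets = list(sales[1:])
    k = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def forecast_next(coeffs: Sequence[float], last_sales: float, next_exog: Sequence[float]) -> float:
    """Predict next-period sales from the last observed sales and planned exogenous values."""
    features = [1.0, last_sales, *next_exog]
    return sum(c * f for c, f in zip(coeffs, features))
```

Calling `fit_forecaster(sales, exog)` on historical data and then `forecast_next(coeffs, sales[-1], planned_exog)` gives a one-step-ahead estimate. A production system would more likely use a library model with several lags, seasonality, and regularisation; the point here is only that planned ad spend or discounts enter the forecast as inputs alongside past sales.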
Furthermore, Graas now allows its customers to interact with their data using natural language, helping them gain deeper insights into its dynamics. Panditrao said that the initial versions of this feature are currently in beta testing.
Panditrao said that data collection is a critical aspect of their business, and the company is relying on Snowflake to support this effort. "I would say 80% of our focus is on data collection and organization, and that's really where Snowflake has helped us tremendously. We depend heavily on its flexibility to keep all this data in sync and ready for analysis."
"We moved to Snowflake about three to three and a half years ago, being one of the early movers. We recognised the need when our data started to balloon," said Panditrao.
Snowflake has enabled Graas to rapidly ingest data and generate analytics dashboards, offering customers real-time visibility into integrated business metrics across various platforms in one location. Customers also gain actionable insights much faster, which allows them to make decisions in near real time. Graas also leverages Snowflake features such as Data Share to share data directly with its customers in their own Snowflake instances.
He further said that Graas expanded its customer base by over 10x after migrating to Snowflake, showcasing the platform's capacity to support substantial growth efficiently. By implementing an optimised data loading strategy with Snowflake, Graas was able to reduce data processing costs even while scaling its customer base.
"The adoption of Snowflake's Snowpipe enabled Graas to deliver analytics in near-real-time, with analytics dashboards rendered in less than one minute after a fresh connection is made, thereby enhancing customer insights and decision-making speed."
Panditrao said that Graas has seen data volume increases of over 8x on sale days, and Snowflake has easily managed these surges. "With Snowflake's Snowpipe streaming capabilities, we now ingest data in real-time, reducing our previous 15-minute delay," he said.
Why Snowflake?
Snowflake's Snowpipe allows for automated, near real-time data ingestion without manual scheduling or managing compute resources. It loads data in small batches, making it available for querying within minutes rather than hours. Snowpipe can ingest data from cloud storage services like Amazon S3, Google Cloud Storage, and Microsoft Azure Blob Storage.
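As a rough illustration of what setting up such a pipe involves, the sketch below assembles a Snowflake `CREATE PIPE` statement with `AUTO_INGEST = TRUE`, which wraps a `COPY INTO` command that loads new files from an external stage. The helper function, the pipe, table, and stage names are all hypothetical and do not reflect Graas's setup. The sketch only builds the SQL text and assumes the stage and the cloud storage event notifications have been configured separately.

```python
import re

# File format types accepted in Snowflake's FILE_FORMAT = (TYPE = ...) clause.
SUPPORTED_FILE_TYPES = {"CSV", "JSON", "AVRO", "ORC", "PARQUET", "XML"}

# Unquoted identifier, optionally qualified as database.schema.object.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}")


def build_snowpipe_ddl(pipe_name: str, table_name: str, stage_name: str,
                       file_type: str = "CSV") -> str:
    """Return a CREATE PIPE statement for auto-ingesting staged files.

    Hypothetical helper: it validates the names to avoid building malformed
    SQL, then returns the statement as text without executing anything.
    """
    for identifier in (pipe_name, table_name, stage_name):
        if not _IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"invalid identifier: {identifier!r}")
    file_type = file_type.upper()
    if file_type not in SUPPORTED_FILE_TYPES:
        raise ValueError(f"unsupported file type: {file_type!r}")
    return (
        f"CREATE PIPE IF NOT EXISTS {pipe_name}\n"
        f"  AUTO_INGEST = TRUE\n"
        f"  AS COPY INTO {table_name}\n"
        f"  FROM @{stage_name}\n"
        f"  FILE_FORMAT = (TYPE = '{file_type}');"
    )
```

For example, `build_snowpipe_ddl("orders_pipe", "raw.orders", "orders_stage")` yields a statement that could be run once through any Snowflake client. After that, newly arriving files in the stage would be loaded without a scheduled job, which is the "no manual scheduling" property described above.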
In an exclusive interview with AIM, Vijayant Rai, managing director of Snowflake India, said that customers can easily use generative AI with their data in Snowflake without worrying about data security. "When you apply large language models or need to create applications, the generative AI or the LLM actually accesses the data. The data doesn't leave the platform," he said.
Rai shared that many large companies currently utilising Snowflake have their data stored on the platform, enabling them to perform real-time analytics on it. He explained that these companies often possess a significant amount of legacy data, some of which may be 40 to 50 years old.
"They also work with new data coming in from various channels, whether structured or unstructured, including sources from the internet," he said. "Snowflake provides a secure destination where they can effectively manage and analyze all of this data."
Unlike Snowflake, Databricks is a unified analytics platform that excels in processing large datasets using Apache Spark. While Databricks offers capabilities for streaming data ingestion, its primary focus is on data engineering, machine learning, and collaborative analytics. This means that while it can handle real-time data, the setup and maintenance can be more complex compared to Snowpipe's straightforward approach.
"The main difference is that Snowflake was designed at its core to be a data warehouse solution, while Databricks was created to be an ML pipeline solution. Increasingly, the data world has merged these offerings, so that Amazon, Snowflake, and Databricks are all competing for the all-in-one solution," posted a user on Reddit.
However, Databricks' goal is currently to build a USB of AI, where vendors don't need to worry about where their data is stored. The company's recent acquisition of Tabular is a testament to that.
"We don't understand all the intricacies of the Iceberg format, but the original creators of Apache Iceberg do. So now, at Databricks, we have employees from both of these projects, Delta and Iceberg. We really want to double down on ensuring that UniForm has full 100% compatibility and interoperability for both of those," said Databricks chief Ali Ghodsi at Data + AI Summit 2024.
Databricks also announced the general availability of Delta Lake UniForm, which supports both the Delta Lake and Iceberg formats. Meanwhile, Snowflake recently announced Polaris Catalog, a vendor-neutral, open catalog implementation for Apache Iceberg, at its Data Cloud Summit held this year.
Some of the notable customers of Databricks are Adobe, AT&T, Block (Square, CashApp, Tidal), Burberry, Rivian, and U.S. Postal Service.
Ghodsi pointed out that every company's data estate is spread across several data warehouses, leaving the data siloed everywhere. This adds a lot of complexity and cost, and ultimately locks companies into these proprietary system silos.
He explained that the idea was to let users own their data and store it in data lakes where any vendor can then plug their data platforms into that data, allowing users to decide which platform suits them best. This removes lock-in, reduces the cost, and also lets users get many more use cases by giving them the choice to use different engines for different purposes if they want.