Gretel announces $12M Series A to make it easier to anonymize data

As companies work with data, one of the big obstacles they face is making sure they are not exposing personally identifiable information (PII) or other sensitive data. It usually requires a painstaking manual effort to strip out that data. Gretel, an early-stage startup, wants to change that by making it faster and easier to anonymize data sets. Today the company announced a $12 million Series A led by Greylock, bringing its total raised to $15.5 million.

Gretel founder and CEO Alex Watson says that his company was founded to make it simpler to anonymize data and unlock data sets that were previously out of reach because of privacy concerns.

“As a developer, you want to test an idea or build a new feature, and it can take weeks to get access to the data you need. Then essentially it boils down to getting approvals to get started, then snapshotting a database, and manually removing what looks like personal data and hoping that you got everything,” he said.

Watson, who previously worked as a GM at AWS, believed that there needed to be a faster and more reliable way to anonymize data, and that’s why he started Gretel. The first product is an open-source synthetic data machine learning library for developers that strips out personally identifiable information.

“Developers use our open source library, which trains machine learning models on their sensitive data, then as that training is happening we are enforcing something called differential privacy, which basically ensures that the model doesn’t memorize details about secrets for individual people inside of the data,” he said. The result is a new artificial data set that is anonymized and safe to share across a business.
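
Gretel hasn’t published the internals of that pipeline here, but the differential-privacy step Watson describes follows a well-known recipe, often called DP-SGD: clip each record’s gradient so no single person can dominate an update, then add calibrated noise. A minimal conceptual sketch in NumPy (the clipping bound and noise scale are illustrative assumptions, not Gretel’s code):

```python
import numpy as np

def dp_sgd_step(weights, per_example_grads, lr=0.1,
                clip_norm=1.0, noise_multiplier=1.1):
    """One differentially private SGD step (conceptual sketch only).

    Clipping bounds any single record's influence on the update;
    Gaussian noise masks what remains, so the trained model can't
    memorize secrets about individual people in the data.
    """
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping bound.
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    total = np.sum(clipped, axis=0)
    # Noise is calibrated to the clipping bound (the update's sensitivity).
    total += np.random.normal(0.0, noise_multiplier * clip_norm, total.shape)
    return weights - lr * total / len(per_example_grads)

# Toy usage: per-example gradients for a batch of two records.
w = np.zeros(3)
grads = [np.array([0.5, -2.0, 1.0]), np.array([3.0, 0.1, -0.4])]
w = dp_sgd_step(w, grads)
```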

The company was founded last year and has spent this year developing the open-source product and building a community around it. “So our approach and our go-to-market here is we’ve open sourced our underlying libraries, and we will also build a SaaS service that makes it really easy to generate synthetic data and anonymized data at scale,” he said.

As the founders build the company, they are looking at how to build a diverse and inclusive organization, something that they discuss at their regular founders’ meetings, especially as they look to take these investment dollars and begin to hire additional senior people.

“We make a conscious effort to have diverse candidates apply, and to really make sure we reach out to them and have a conversation, and that’s paid off, or is in the process of paying off, I would say, with the candidates in our pipeline right now. So we’re excited. It’s tremendously important that we avoid the groupthink that happens so often,” he said.

The company doesn’t have paying customers yet, but the plan is to build off the relationships it has with design partners and begin taking in revenue next year. Sridhar Ramaswamy, the Greylock partner leading the investment, says his firm is placing a bet on a pre-revenue company because he sees great potential for a service like this.

“We think Gretel will democratize safe and controlled access to data for the whole world the way GitHub democratized source code access and control,” Ramaswamy said.


By Ron Miller

Mozart Data lands $4M seed to provide out-of-the-box data stack

Mozart Data founders Peter Fishman and Dan Silberman have been friends for over 20 years, working at various startups and even launching a hot sauce company together along the way. As technologists, they saw companies building the same data stack over and over. They decided to provide one out of the box, and Mozart Data was born.

The company graduated from the Y Combinator Summer 2020 cohort in August and announced a $4 million seed round today led by Craft Ventures and Array Ventures with participation from Coelius Capital, Jigsaw VC, Signia VC, Taurus VC and various angel investors.

In spite of the detour into hot sauce, the two founders were mostly involved with data over the years, and they formed strong opinions about what a data stack should look like. “We wanted to bring the same stack that we’ve been building at all these different startups, and make it available more broadly,” Fishman told TechCrunch.

They see a modern data stack as one that spans different databases, SaaS tools and data sources. Mozart pulls those together, processes the data and makes it ready for whatever business intelligence tool you use. “We do all of the parts before the BI tool. So we extract and load the data. We manage a data warehouse for you under the hood in Snowflake, and we provide a layer for you to do transformations,” he said.
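
Mozart hasn’t detailed its implementation, but the division of labor Fishman describes is the classic extract-load-transform pattern. Here is a toy sketch of those three steps, with SQLite standing in for the managed Snowflake warehouse and hard-coded rows standing in for a SaaS source:

```python
import sqlite3

# Toy stand-in for the managed warehouse (Mozart runs Snowflake under the hood).
warehouse = sqlite3.connect(":memory:")

# Extract: pull rows from a source (hard-coded here in place of a SaaS API).
crm_rows = [("acme", "won", 1200), ("globex", "lost", 0)]

# Load: land the raw data in the warehouse untouched.
warehouse.execute("CREATE TABLE raw_deals (account TEXT, status TEXT, amount INT)")
warehouse.executemany("INSERT INTO raw_deals VALUES (?, ?, ?)", crm_rows)

# Transform: build a cleaned, BI-ready table on top of the raw one.
warehouse.execute("""
    CREATE TABLE won_revenue AS
    SELECT account, amount FROM raw_deals WHERE status = 'won'
""")

print(warehouse.execute("SELECT * FROM won_revenue").fetchall())
# [('acme', 1200)]
```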

The service is aimed mostly at technical people who know some SQL, such as data analysts, data scientists and sales and marketing operations staff. They founded the company earlier this year with their own money, and joined Y Combinator in June. Today, they have about a dozen customers and six employees. They expect to add 10-12 more employees in the next year.

Fishman says they have mostly hired from their networks, but have begun looking outward as they make their next hires with a goal of building a diverse company. In fact, they have made offers to several diverse candidates, who didn’t ultimately take the job, but he believes if you start looking at the top of the funnel, you will get good results. “I think if you spend a lot of energy in terms of top of funnel recruiting, you end up getting a good, diverse set at the bottom,” he said.

The company has been able to start from scratch in the midst of a pandemic and add employees and customers because the founders had a good network to pitch the product to, but they understand that moving forward they will have to move beyond it. They plan to use their experience as users to drive their message.

“I think talking about some of the whys and the rationale is our strategy for adding value to customers […], it’s about basically how would we set up a data stack if we were at this type of startup,” he said.


By Ron Miller

Rockset announces $40M Series B as data analytics solution gains momentum

Rockset, a cloud-native analytics company, announced a $40 million Series B investment today led by Sequoia with help from Greylock, the same two firms that financed its Series A. The startup has now raised a total of $61.5 million, according to the company.

As co-founder and CEO Venkat Venkataramani told me at the time of the Series A in 2018, there is a lot of manual work involved in getting data ready to use, and that work acts as a roadblock to getting to real insight. He hoped to change that with Rockset.

“We’re building out our service with innovative architecture and unique capabilities that allows full-featured fast SQL directly on raw data. And we’re offering this as a service. So developers and data scientists can go from useful data in any shape, any form to useful applications in a matter of minutes. And it would take months today,” he told me in 2018.

In fact, “Rockset automatically builds a converged index on any data — including structured, semi-structured, geographical and time series data — for high-performance search and analytics at scale,” the company explained.
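
Rockset’s engine itself is proprietary, but the idea of a converged index can be illustrated: keep the same documents in a row store for point lookups, a column store for fast aggregations and an inverted index for search. A toy Python sketch of those three views over the same data:

```python
from collections import defaultdict

docs = [
    {"id": 1, "city": "Berlin", "temp": 21},
    {"id": 2, "city": "Oslo",   "temp": 14},
]

row_store = {}                    # id -> full document (point lookups)
column_store = defaultdict(list)  # field -> values (fast scans/aggregation)
inverted = defaultdict(set)       # (field, value) -> doc ids (search)

for d in docs:
    row_store[d["id"]] = d
    for field, value in d.items():
        column_store[field].append(value)
        inverted[(field, value)].add(d["id"])

# Search: which documents mention Oslo?
print(inverted[("city", "Oslo")])             # {2}
# Analytics: average temperature across all documents.
print(sum(column_store["temp"]) / len(docs))  # 17.5
```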

The product seems to be resonating with investors and customers alike, as the company raised a healthy B round and business is booming. Rockset supplied a few metrics to illustrate this. For starters, revenue grew 290% in the last quarter. While the company didn’t provide a baseline for that percentage, the growth is clearly substantial.

In addition, the startup reports adding hundreds of new users, again without nailing down specific numbers, and queries on the platform are up 313%. Without specifics, it’s hard to know exactly what that means, but it sounds like healthy growth for an early-stage startup, especially in this economy.

Mike Vernal, a partner at Sequoia, sees a company helping to get data to work faster than other solutions, which require a lot of handling first. “Rockset, with its innovative new approach to indexing data, has quickly emerged as a true leader for real-time analytics in the cloud. I’m thrilled to partner with the company through its next phase of growth,” Vernal said in a statement.

The company was founded in 2016 by the creators of RocksDB. The startup had previously raised a $3 million seed round at launch and the $18.5 million Series A in 2018.


By Ron Miller

Data virtualization service Varada raises $12M

Varada, a Tel Aviv-based startup that focuses on making it easier for businesses to query data across services, today announced that it has raised a $12 million Series A round led by Israeli early-stage fund MizMaa Ventures, with participation by Gefen Capital.

“If you look at the storage aspect for big data, there’s always innovation, but we can put a lot of data in one place,” Varada CEO and co-founder Eran Vanounou told me. “But translating data into insight? It’s so hard. It’s costly. It’s slow. It’s complicated.”

That’s a lesson he learned during his time as CTO of LivePerson, which he described as a classic big data company. And just like at LivePerson, where the team had to reinvent the wheel again and again to solve its data problems, every company — and not just the large enterprises — now struggles with managing its data and getting insights out of it, Vanounou argued.

The rest of the founding team, David Krakov, Roman Vainbrand and Tal Ben-Moshe, already had a lot of experience in dealing with these problems, too, with Ben-Moshe having served as the Chief Software Architect of Dell EMC’s XtremIO flash array unit, for example. They built the system for indexing big data that’s at the core of Varada’s platform (with the open-source Presto SQL query engine being one of the other cornerstones).

Essentially, Varada embraces the idea of data lakes and enriches that with its indexing capabilities. And those indexing capabilities are where Varada’s smarts can be found. As Vanounou explained, the company is using a machine learning system to understand when users tend to run certain workloads and then caches the data ahead of time, making the system far faster than its competitors.

“If you think about big organizations and think about the workloads and the queries, what happens during the morning time is different from evening time. What happened yesterday is not what happened today. What happened on a rainy day is not what happened on a shiny day. […] We listen to what’s going on and we optimize. We leverage the indexing technology. We index what is needed when it is needed.”
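
Varada’s scheduler is not open source, but the pattern Vanounou describes reduces to learning which columns a recurring workload touches and building their indexes just before that workload arrives. A crude sketch of the idea (the hourly bucketing and threshold are invented for illustration):

```python
from collections import Counter

query_log = Counter()   # (hour_of_day, column) -> how often it was queried
hot_indexes = {}        # column -> prebuilt index (stand-in for Varada's caches)

def record_query(hour, column):
    query_log[(hour, column)] += 1

def warm_for(hour, table, threshold=100):
    """Just before a given hour, index only the columns history says will be hot."""
    for (h, column), hits in query_log.items():
        if h == hour and hits >= threshold and column not in hot_indexes:
            # Building the index ahead of demand stands in for adaptive caching.
            hot_indexes[column] = sorted(row[column] for row in table)

# Morning dashboards hammer the "revenue" column; index it before 9am arrives.
for _ in range(150):
    record_query(9, "revenue")
warm_for(9, [{"revenue": 10}, {"revenue": 7}])
print(list(hot_indexes))  # ['revenue']
```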

That helps speed up queries, but it also means less data has to be replicated, which brings down costs. As MizMaa’s Aaron Applebaum noted, since Varada is not a SaaS solution, the buyers still get all of the discounts from their cloud providers, too.

In addition, the system can allocate resources intelligently so that different users can tap into different amounts of bandwidth. You can tell it to give customers more bandwidth than your financial analysts, for example.

“Data is growing like crazy: in volume, in scale, in complexity, in who requires it and what the business intelligence uses are, what the API uses are,” Applebaum said when I asked him why he decided to invest. “And compute is getting slightly cheaper, but not really, and storage is getting cheaper. So if you can make the trade-off to store more stuff, and access things more intelligently, more quickly, more agile — that was the basis of our thesis, as long as you can do it without compromising performance.”

Varada, with its team of experienced executives, architects and engineers, ticked a lot of MizMaa’s boxes in this regard, but Applebaum also noted that unlike some other Israeli startups, the team understood that it had to listen to customers and understand their needs, too.

“In Israel, you have a history — and it’s become less and less the case — but historically, there’s a joke that it’s ‘ready, fire, aim.’ You build a technology, you’ve got this beautiful thing and you’re like, ‘alright, we did it,’ but without listening to the needs of the customer,” he explained.

The Varada team is not afraid to compare itself to Snowflake, which at least at first glance seems to make similar promises. Vanounou praised the company for opening up the data warehousing market and proving that people are willing to pay for good analytics. But he argues that Varada’s approach is fundamentally different.

“We embrace the data lake. So if you are Mr. Customer, your data is your data. We’re not going to take it, move it, copy it. This is your single source of truth,” he said. In addition, the data can stay in the company’s virtual private cloud. He also argues that Varada is focused not so much on business users as on the technologists inside a company.


By Frederic Lardinois

Harbr emerges from stealth to help build online data marketplaces

Harbr co-founder Anthony Cosgrove has been working with data for over 15 years, so he knows firsthand the problems associated with pulling data together in a way that makes it easy for others to consume, whether internally or externally. Like many entrepreneurs before him, he decided to start a company to solve that problem, and today it came out of stealth.

Cosgrove explained that in his experience, data platforms of the past had several problems. “They were too slow. They were too expensive and too risky, and when you got the data you then ended up working in a silo with really no repeatability of anything that you did for anybody else in your organization,” he explained.

Cosgrove started Harbr because he saw a dearth of tools to help with these issues. “We wanted to create an environment where organizations could share their data, collaborate on that data and create new versions of that data that were really optimized for very specific use cases,” he said.

For now, the company is concentrating on large data vendors, helping them package and monetize the data they produce as a business more efficiently, but Cosgrove sees a time when he could be helping firms that produce data as a byproduct of conducting business to monetize that data more easily.

He says these big data businesses generally lack the agility to package data in ways that make sense for each customer, and his company’s product should help solve that. “They’re able to start working directly with their customers to move away from kind of sending data to actually selling services, models or insights, which is what customers really want,” he said.

One other unique aspect of the tool is that it is a true platform, meaning that you are not just restricted to the data in your system. You can pull together other data sources as well, and that could make for even more interesting ways to package the data for customers.

The company launched in London in 2017 and spent some time building the product. It recently opened offices in the United States and currently has 30 employees divided between the two locations. It has raised $6.5 million in seed capital led by Boldstart Ventures.


By Ron Miller

Fishtown Analytics raises $12.9M Series A for its open-source analytics engineering tool

Philadelphia-based Fishtown Analytics, the company behind the popular open-source data engineering tool dbt, today announced that it has raised a $12.9 million Series A round led by Andreessen Horowitz, with the firm’s general partner Martin Casado joining the company’s board.

“I wrote this blog post in early 2016, essentially saying that analysts needed to work in a fundamentally different way,” Fishtown founder and CEO Tristan Handy told me, when I asked him about how the product came to be. “They needed to work in a way that much more closely mirrored the way the software engineers work and software engineers have been figuring this shit out for years and data analysts are still like sending each other Microsoft Excel docs over email.”

The dbt open-source project forms the basis of this. It allows anyone who can write SQL queries to transform data and then load it into their preferred analytics tools. As such, it sits between the data warehouse and the tools that load data into it on one end, and specialized analytics tools on the other.
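
In dbt itself, a “model” is just a SQL SELECT statement that the tool materializes as a table or view in the warehouse, resolving dependencies between models along the way. A rough Python stand-in for that materialization step, using SQLite in place of a real warehouse and a made-up model:

```python
import sqlite3

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE raw_orders (id INT, status TEXT, total REAL)")
warehouse.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)",
                      [(1, "complete", 20.0), (2, "returned", 15.0)])

# A dbt model is essentially a named SELECT; the tool turns it into a view/table.
models = {
    "completed_orders": "SELECT id, total FROM raw_orders WHERE status = 'complete'",
}

for name, select_sql in models.items():
    warehouse.execute(f"CREATE VIEW {name} AS {select_sql}")

print(warehouse.execute("SELECT * FROM completed_orders").fetchall())
# [(1, 20.0)]
```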

As Casado noted when I talked to him about the investment, data warehouses have now made it affordable for businesses to store all of their data before it is transformed. So what was traditionally “extract, transform, load” (ETL) has now become “extract, load, transform” (ELT). Andreessen Horowitz is already invested in Fivetran, which helps businesses move their data into their warehouses, so it makes sense for the firm to also tackle the other side of this business.

“Dbt is, as far as we can tell, the leading community for transformation and it’s a company we’ve been tracking for at least a year,” Casado said. He also argued that data analysts — unlike data scientists — are not really catered to as a group.

Even though it has been around for a few years now, Fishtown hadn’t raised a lot of money before this round, save for a small SAFE round from Amplify.

But Handy argued that the company needed this time to prove that it was on to something and build a community. That community now consists of more than 1,700 companies that use the dbt project in some form and over 5,000 people in the dbt Slack community. Fishtown also now has over 250 dbt Cloud customers and the company signed up a number of big enterprise clients earlier this year. With that, the company needed to raise money to expand and also better service its current list of customers.

“We live in Philadelphia. The cost of living is low here and none of us really care to make a quadro-billion dollars, but we do want to answer the question of how do we best serve the community,” Handy said. “And for the first time, in the early part of the year, we were like, holy shit, we can’t keep up with all of the stuff that people need from us.”

The company plans to expand the team from 25 to 50 employees in 2020. With those hires, the team plans to improve and expand the product, especially its IDE for data analysts, which Handy admitted could use a bit more polish.


By Frederic Lardinois

Pinpoint releases dashboard to bring visibility to software engineering operations

As companies look for better ways to understand how different departments work at a granular level, engineering has traditionally been a black box of siloed data. Pinpoint, an Austin-based startup, has been working on a platform to bring this information into a single view, and today it released a dashboard to help companies understand what’s happening across software engineering from an operational perspective.

Jeff Haynie, co-founder and CEO at Pinpoint, says the company’s mission for the last two years has been giving greater visibility into the engineering department, something he says is even more important in the current context with workers spread out at home.

“Companies give engineering a bunch of money, and they build a bunch of amazing things, but in the end it is just a black box and we really don’t know what happens,” Haynie said. He says his company has been working to take all of that data, contextualize it, bring it together and correlate the information.

Today, they are introducing a dashboard that takes what they’ve been building and pulls it together into a single view, one that is 100% self-serve. Prior to this, you needed a bunch of hand-holding from Pinpoint personnel to get it up and running, but now you can download the product and sign into your various services, such as your Git repository, your CI/CD software, your IDE and so forth.

What’s more, it provides a way for engineering personnel to communicate with one another without leaving the tool.

Pinpoint software engineering dashboard. Image Credit: Pinpoint

“Obviously we will handhold and help people as they need it, and we have an enterprise version of the product with a higher level of SLA, and we have a customer success team to do that, but we’ve really focused this new release on purely self service,” Haynie said.

What’s more, while there is already a version for teams under 10 people that’s free forever, with the release of today’s product the company is offering unlimited access to the dashboard for free for three months.

Haynie says they’re like any startup right now, but having experience with several other startups and having lived through 9/11, the dot-com crash, 2008 and so forth, he knows how to hunker down and preserve cash. At the same time, he says they are seeing a lot of inbound interest in the product, and they wanted to come up with a creative way to help customers through this crisis, while putting the product out there for people to use.

“We’re like any other startup or any other business frankly at this point: we’re nervous and scared. How do you survive this [and how long will it last]? The other side of it is that we’re rushing to take advantage of this inbound interest that we’re getting and trying to sort of seize the opportunity and try to be creative about how we help them.”

The startup hopes that if companies find the product useful, after three months they won’t mind paying for the full version. For now, it’s just putting it out there for free and seeing what happens with it — just another startup trying to find a way through this crisis.


By Ron Miller

Free tool helps manufacturers map where COVID-19 impacts supply chain

Assent Compliance, a company that helps large manufacturers like GE and Rolls-Royce manage complex supply chains through an online data exchange, announced a new tool this week that lets any company, whether a customer or not, upload bills of materials and see on a map where COVID-19 is having an impact on their supply chain.

Company co-founder Matt Whitteker says the Ottawa startup focuses on supply chain data management, which means it has the data and the tooling to develop a data-driven supply chain map based on WHO data identifying COVID hotspots. He believes his is the only company to have done this.

“We’re the only ones that have taken supply chain data and applied it to this particular pandemic. And it’s something that’s really native to our platform. We have all that data on hand — we have location data for suppliers. So it’s just a matter of applying that with third party data sources (like the WHO data), and then extracting valuable business intelligence from it,” he said.

If you want to participate, you simply go to the company website and fill out a form. A customer success employee will contact you and walk you through the process of uploading your data to the platform. Once they have your data, they generate a map showing the parts of the world where your supply chain is most likely to be disrupted, identifying the level of risk based on your individual data.

The company captures supply chain data as part of doing business with the 1,000 customers and 500,000 suppliers currently on its platform. “When companies are manufacturing products they have what’s called a bill of materials, kind of like a recipe. And companies upload their bill of materials that basically outlines all their parts, components and commodities, and who they get them from, which basically represents their supply chain,” Whitteker explained.
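
Assent hasn’t said exactly how the map is computed, but the join it implies, matching supplier locations from an uploaded bill of materials against hotspot data, looks roughly like this sketch (the parts, regions and risk levels are all invented for illustration):

```python
# Invented illustrative data: supplier regions from an uploaded bill of materials.
bill_of_materials = [
    {"part": "valve-a",  "supplier": "S1", "region": "Lombardy"},
    {"part": "gasket-b", "supplier": "S2", "region": "Bavaria"},
]

# Stand-in for third-party hotspot data (e.g., derived from WHO case counts).
hotspot_risk = {"Lombardy": "high", "Bavaria": "medium"}

def supply_chain_risk(bom):
    """Flag each part with the outbreak risk of its supplier's region."""
    return [
        {**item, "risk": hotspot_risk.get(item["region"], "unknown")}
        for item in bom
    ]

for row in supply_chain_risk(bill_of_materials):
    print(row["part"], "->", row["risk"])
```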

After the company uploads the bill of materials, Assent opens a portal for the companies to exchange data, which might be tax forms, proof of sourcing or any kind of information and documentation the manufacturer needs to comply with legal and regulatory rules around procurement of a given part.

They decided to start building the COVID-19 map application when they recognized that the pandemic was going to cause the biggest supply chain disruption the world has seen since World War II. It took about a month to build. The tool went into beta last week with customers, and over 350 signed up in the first two hours. This week, they made it generally available to anyone, even non-customers, for free.

The company was founded in 2016 and has raised $220 million, according to Whitteker.


By Ron Miller

Datastax acquires The Last Pickle

Data management company Datastax, one of the largest contributors to the Apache Cassandra project, today announced that it has acquired The Last Pickle (and no, I don’t know what’s up with that name either), a New Zealand-based Cassandra consulting and services firm that’s behind a number of popular open-source tools for the distributed NoSQL database.

As Datastax Chief Strategy Officer Sam Ramji, whom you may remember from his recent tenure at Apigee, the Cloud Foundry Foundation, Google and Autodesk, told me, The Last Pickle is one of the premier Apache Cassandra consulting and services companies. The team there has been building Cassandra-based open-source solutions for the likes of Spotify, T-Mobile and AT&T since it was founded back in 2012. And while The Last Pickle is based in New Zealand, the company has engineers all over the world who do the heavy lifting and help these companies successfully implement the Cassandra database technology.

It’s worth mentioning that Last Pickle CEO Aaron Morton first discovered Cassandra when he worked for Weta Digital on the special effects for Avatar, where the team used Cassandra to allow the VFX artists to store their data.

“There’s two parts to what they do,” Ramji explained. “One is the very visible consulting, which has led them to become world experts in the operation of Cassandra. So as we automate Cassandra and as we improve the operability of the project with enterprises, their embodied wisdom about how to operate and scale Apache Cassandra is as good as it gets — the best in the world.” And The Last Pickle’s experience in building systems with tens of thousands of nodes — and the challenges that its customers face — is something Datastax can then offer to its customers as well.

And Datastax, of course, also plans to productize The Last Pickle’s open-source tools like the automated repair tool Reaper and the Medusa backup and restore system.

As both Ramji and Datastax VP of Engineering Josh McKenzie stressed, Cassandra has seen a lot of commercial development in recent years, with the likes of AWS now offering a managed Cassandra service, for example, but there wasn’t all that much hype around the project anymore. They argue that’s a good thing: now that it is over ten years old, Cassandra has been battle-hardened. For the last ten years, Ramji argues, the industry tried to figure out what the de facto standard for scale-out computing should be. By 2019, it became clear that Kubernetes was the answer.

“This next decade is about what is the de facto standard for scale-out data? We think that’s got certain affordances, certain structural needs, and we think that the decades that Cassandra has spent getting hardened put it in a position to be data for that wave.”

McKenzie also noted that Cassandra’s built-in features, like support for multiple data centers and geo-replication, rolling updates and live scaling, along with wide support across programming languages, give it a number of advantages over competing databases.

“It’s easy to forget how much Cassandra gives you for free just based on its architecture,” he said. “Losing the power in an entire datacenter, upgrading the version of the database, hardware failing every day? No problem. The cluster is 100 percent always still up and available. The tooling and expertise of The Last Pickle really help bring all this distributed and resilient power into the hands of the masses.”
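
Several of the features McKenzie lists fall out of how a Cassandra keyspace is declared. Multi-datacenter geo-replication, for instance, is a one-line replication strategy; a minimal sketch using the DataStax Python driver (the contact point and datacenter names are placeholders, and a running cluster is assumed):

```python
from cassandra.cluster import Cluster  # pip install cassandra-driver

cluster = Cluster(["10.0.0.1"])  # placeholder contact point
session = cluster.connect()

# NetworkTopologyStrategy replicates every row to both datacenters,
# so losing one entire DC leaves the cluster fully available.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS orders
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'us_east': 3,
        'eu_west': 3
    }
""")
```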

The two companies did not disclose the price of the acquisition.


By Frederic Lardinois

Fivetran hauls in $44M Series B as data pipeline business booms

Fivetran, a startup that helps companies move data from disparate repositories to data warehouses, announced $44 million in Series B financing today, less than a year after collecting a $15 million Series A.

Andreessen Horowitz (A16Z) led the round with participation from existing investors Matrix Partners and CEAS Investments. As part of the deal, Martin Casado from A16Z will join the Fivetran board. Today’s investment brings the total raised to over $59 million, according to Crunchbase.

Company co-founder and CEO George Fraser said they raised a little sooner than expected, but they needed a cash infusion to keep up with the steady growth they have been seeing. He said the company also wanted to get the funding done while the capital markets were still strong. “If we wait four months or six months, the terms are not going to be that much better — and, who knows, there could be a recession. You never know how long the sun shines, and we had interest from some really good firms that we liked, and that’s a big factor too, obviously,” he said.

He added that it’s not purely an economic decision. “We’re really happy with where we landed with Martin [Casado] joining the board and Andreessen Horowitz on the cap table, but [the economic outlook] was definitely part of our calculus.”

And Casado is happy to have invested in Fivetran. Writing in a blog post today about the investment, he sees a company that’s solving a big problem in a modern context. “Fivetran is a SaaS service that connects to the critical data sources in an organization, pulls and processes all the data, and then dumps it into a warehouse (e.g., Snowflake, BigQuery or Redshift) for SQL access and further transformations, if needed. If data is the new oil, then Fivetran is the pipes that get it from the source to the refinery,” he wrote.
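
Casado’s pipes metaphor maps onto what a managed connector does on every run: remember a cursor per source, pull only records newer than it, and land them in the warehouse. A toy sketch of that incremental loop (the source records are invented; real Fivetran connectors also handle schema drift, retries and deletes):

```python
warehouse_rows = []   # stand-in for the destination warehouse table
cursor_position = 0   # persisted between runs in a real connector

def fetch_source_events(since_id):
    """Stand-in for paging a SaaS API; returns records newer than the cursor."""
    source = [(1, "signup"), (2, "purchase"), (3, "churn")]
    return [r for r in source if r[0] > since_id]

def sync():
    """One incremental run: pull only new records, load them, advance the cursor."""
    global cursor_position
    new_rows = fetch_source_events(cursor_position)
    warehouse_rows.extend(new_rows)
    if new_rows:
        cursor_position = max(r[0] for r in new_rows)

sync()  # first run loads all three events
sync()  # second run loads nothing: the cursor is already at 3
print(warehouse_rows)
```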

He said that the company already has over 750 customers and A16Z is included among them. It certainly doesn’t hurt when your lead investor uses your product.

The company was founded in 2012 and has been growing steadily. Last year, at the time of its Series A, it had 80 employees; today it has 175. Fraser expects that to double again over the next year, and it’s all driven by business needs. He says that over the last 12 months revenue has grown 3x.

With 150 connectors today, the company wants to continue to expand its array of data connection tools and cover more data requirements. But Fraser says the connectors are complicated to build, and that will take an investment in more engineering talent. Today’s announcement should help in that regard.


By Ron Miller

Tableau update uses AI to increase speed to insight

Tableau was acquired by Salesforce earlier this year for $15.7 billion, but long before that, the company had been working on its fall update, and today it announced several new tools, including a feature called Explain Data that uses AI to get to insight quickly.

“What Explain Data does is it moves users from understanding what happened to why it might have happened by automatically uncovering and explaining what’s going on in your data. So what we’ve done is we’ve embedded a sophisticated statistical engine in Tableau, that when launched automatically analyzes all the data on behalf of the user, and brings up possible explanations of the most relevant factors that are driving a particular data point,” Tableau chief product officer Francois Ajenstat explained.

He added that what this really means is that it saves users time by automatically doing the analysis for them, and it should help them do better analysis by removing biases and helping them dive deep into the data in an automated fashion.

Explain Data Superstore extreme value. Image: Tableau

Ajenstat says this is a major improvement, in that previously users would have had to do all of this work manually. “So a human would have to go through every possible combination, and people would find incredible insights, but it was manually driven. Now with this engine, they are able to essentially drive automation to find those insights automatically for the users,” he said.

He says this has two major advantages. First of all, because it’s AI-driven, it can deliver meaningful insight much faster, but it also gives a more rigorous perspective on the data.
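
Tableau hasn’t detailed the statistics behind Explain Data, but the manual search Ajenstat describes, trying each candidate factor to see which best separates an outlying value, can be caricatured in a few lines (toy data and a deliberately naive score):

```python
from statistics import mean

# Toy sales rows; suppose the analyst is puzzled by low revenue in "West".
rows = [
    {"region": "West", "channel": "online", "discount": 0.40, "revenue": 50},
    {"region": "West", "channel": "retail", "discount": 0.05, "revenue": 200},
    {"region": "East", "channel": "online", "discount": 0.10, "revenue": 210},
    {"region": "East", "channel": "retail", "discount": 0.05, "revenue": 220},
]

def explain(rows, target="revenue"):
    """Rank candidate factors by how far apart they split the target's means."""
    scores = {}
    for factor in ("channel", "discount"):
        groups = {}
        for r in rows:
            groups.setdefault(r[factor], []).append(r[target])
        group_means = [mean(v) for v in groups.values()]
        # Crude spread-of-means score: a bigger spread is more explanatory.
        scores[factor] = max(group_means) - min(group_means)
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(explain(rows))  # factors ordered by explanatory power
```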

In addition, the company announced a new Catalog feature, which provides breadcrumbs showing the source of the data, so users can know where the data came from and whether it’s relevant or trustworthy.

Finally, the company announced a new server management tool that helps companies with broad Tableau deployments across a large organization manage them in a more centralized way.

All of these features are available starting today for Tableau customers.


By Ron Miller

Fungible raises $200 million led by SoftBank Vision Fund to help companies handle increasingly massive amounts of data

Fungible, a startup that wants to help data centers cope with the increasingly massive amounts of data produced by new technologies, has raised a $200 million Series C led by SoftBank Vision Fund, with participation from Norwest Venture Partners and its existing investors. As part of the round, SoftBank Investment Advisers senior managing partner Deep Nishar will join Fungible’s board of directors.

Founded in 2015, Fungible now counts about 200 employees and has raised more than $300 million in total funding. Its other investors include Battery Ventures, Mayfield Fund, Redline Capital and Walden Riverwood Ventures. Its new capital will be used to speed up product development. The company’s founders, CEO Pradeep Sindhu and Bertrand Serlet, say Fungible will release more information later this year about when its data processing units will be available and their on-boarding process, which they say will not require clients to change their existing applications, networking or server design.

Sindhu previously founded Juniper Networks, where he held roles as chief scientist and CEO. Serlet was senior vice president of software engineering at Apple before leaving in 2011 and founding Upthere, a storage startup that was acquired by Western Digital in 2017. Sindhu and Serlet describe Fungible’s objective as pivoting data centers from a “compute-centric” model to a data-centric one. While the company is often asked if it considers Intel and Nvidia competitors, the founders say Fungible’s Data Processing Units (DPUs) complement tech, including central and graphics processing units, from other chip makers.

Sindhu describes Fungible’s DPUs as a new building block in data center infrastructure, allowing them to handle larger amounts of data more efficiently and also potentially enabling new kinds of applications. Its DPUs are fully programmable and connect with standard IPs over Ethernet local area networks and local buses, like PCI Express, which in turn connect to CPUs, GPUs and storage. Placed between the two, the DPUs act like a “super-charged data traffic controller,” performing computations offloaded by the CPUs and GPUs, as well as converting the IP connection into high-speed data center fabric.

This better prepares data centers for the enormous amounts of data generated by new technology, including self-driving cars, and industries such as personalized healthcare, financial services, cloud gaming, agriculture, call centers and manufacturing, says Sindhu.

In a press statement, Nishar said, “As the global data explosion and AI revolution unfold, global computing, storage and networking infrastructure are undergoing a fundamental transformation. Fungible’s products enable data centers to leverage their existing hardware infrastructure and benefit from these new technology paradigms. We look forward to partnering with the company’s visionary and accomplished management team as they power the next generation of data centers.”


By Catherine Shu

With Tableau and Mulesoft, Salesforce gains full view of enterprise data

Back in the 2010 timeframe, it was common to say that content was king, but after watching Google buy Looker for $2.6 billion last week and Salesforce nab Tableau for $15.7 billion this morning, it’s clear that data has ascended to the throne in a business context.

We have been hearing about Big Data for years, but we’ve probably reached a point in 2019 where the data onslaught is really having an impact on business. If you can find the key data nuggets in the big data pile, it can clearly be a competitive advantage, and companies like Google and Salesforce are pulling out their checkbooks to make sure they are in a position to help you out.

While Google, as a cloud infrastructure vendor, is trying to help companies on its platform and across the cloud understand and visualize all that data, Salesforce as a SaaS vendor might have a different reason — one that might surprise you — given that Salesforce was born in the cloud. But perhaps it recognizes something fundamental. If it truly wants to own the enterprise, it has to have a hybrid story, and with Mulesoft and Tableau, that’s precisely what it has — and why it was willing to spend around $23 billion to get it.

Making connections

Certainly, Salesforce chairman Marc Benioff has no trouble seeing the connections between his two big purchases over the last year. He sees the combination of Mulesoft connecting to the data sources and Tableau providing a way to visualize them as a “beautiful thing.”


By Ron Miller

Couchbase’s mobile database gets built-in ML and enhanced synchronization features

Couchbase, the company behind the eponymous NoSQL database, announced a major update to its mobile database today that brings some machine learning smarts, as well as improved synchronization features and enhanced stats and logging support to the software.

“We’ve led the innovation and data management at the edge since the release of our mobile database five years ago,” Couchbase’s VP of Engineering Wayne Carter told me. “And we’re excited that others are doing that now. We feel that it’s very, very important for businesses to be able to utilize these emerging technologies that do sit on the edge to drive their businesses forward, and both making their employees more effective and their customer experience better.”

The latter part is what drove a lot of today’s updates, Carter noted. He also believes that the database is the right place to do some machine learning. So with this release, the company is adding predictive queries to its mobile database. This new API allows mobile apps to take pre-trained machine learning models and run predictive queries against the data that is stored locally. This would allow a retailer to create a tool that can use a phone’s camera to figure out what part a customer is looking for.

To support these predictive queries, Couchbase mobile is also getting support for predictive indexes. “Predictive indexes allow you to create an index on prediction, enabling correlation of real-time predictions with application data in milliseconds,” Carter said. In many ways, that’s also the unique value proposition for bringing machine learning into the database. “What you really need to do is you need to utilize the unique values of a database to be able to deliver the answer to those real-time questions within milliseconds,” explained Carter.
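
Couchbase exposes this through its mobile SDKs; the shape of the feature can be sketched generically in Python: run a pre-trained model over each locally stored document once, index the output, and answer later queries from the index instead of re-running the model (the model and fields here are invented for illustration):

```python
def part_classifier(doc):
    """Stand-in for a pre-trained on-device model (e.g., image -> part name)."""
    return "hex-bolt" if doc["thread_count"] > 10 else "washer"

local_docs = [
    {"id": "d1", "thread_count": 14},
    {"id": "d2", "thread_count": 0},
]

# "Predictive index": the model output is computed once per document and
# indexed, so later queries correlate predictions with data in milliseconds.
predictive_index = {}
for doc in local_docs:
    predictive_index.setdefault(part_classifier(doc), []).append(doc["id"])

# Predictive query: which locally stored docs does the model call a hex-bolt?
print(predictive_index.get("hex-bolt", []))  # ['d1']
```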

The other major new feature in this release is delta synchronization, which allows businesses to push far smaller updates to the databases on their employees’ mobile devices, because the devices only receive the information that changed instead of a fully updated database. Carter says this was a highly requested feature, but until now, the company always had to prioritize work on other components of Couchbase.
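
The saving is easy to see in miniature: a delta sync ships only the fields that changed, not the whole document set. A toy sketch (field-level diffs only; real sync protocols also track deletions and revision histories):

```python
def make_delta(old_doc, new_doc):
    """Fields added or changed since the last sync (deletions omitted for brevity)."""
    return {k: v for k, v in new_doc.items() if old_doc.get(k) != v}

def apply_delta(doc, delta):
    return {**doc, **delta}

on_device = {"sku": "A-17", "price": 9.99, "stock": 40}
on_server = {"sku": "A-17", "price": 8.49, "stock": 40}

delta = make_delta(on_device, on_server)
print(delta)                          # {'price': 8.49} -- all that crosses the wire
print(apply_delta(on_device, delta))  # device catalog now matches the server
```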

This is an especially useful feature for the company’s retail customers, a vertical where it has been quite successful. These users need to keep their catalogs up to date, and quite a few of them supply their employees with mobile devices to help shoppers. Rumor has it that Apple, too, is a Couchbase user.

The update also includes a few new features that will be more of interest to operators, including advanced stats reporting and enhanced logging support.



By Frederic Lardinois

Why Daimler moved its big data platform to the cloud

Like virtually every big enterprise company, a few years ago, the German auto giant Daimler decided to invest in its own on-premises data centers. And while those aren’t going away anytime soon, the company today announced that it has successfully moved its on-premises big data platform to Microsoft’s Azure cloud. This new platform, which the company calls eXtollo, is Daimler’s first major service to run outside of its own data centers, though it’ll probably not be the last.

As Guido Vetter, the head of Daimler’s corporate center of excellence for advanced analytics and big data, told me, the company started getting interested in big data about five years ago. “We invested in technology — the classical way, on-premise — and got a couple of people on it. And we were investigating what we could do with data because data is transforming our whole business as well,” he said.

By 2016, the size of the organization had grown to the point where a more formal structure was needed to enable the company to handle its data at a global scale. At the time, the buzzword was ‘data lakes,’ and the company started building its own in order to build out its analytics capabilities.

Electric Line-Up, Daimler AG

“Sooner or later, we hit the limits as it’s not our core business to run these big environments,” Vetter said. “Flexibility and scalability are what you need for AI and advanced analytics and our whole operations are not set up for that. Our backend operations are set up for keeping a plant running and keeping everything safe and secure.” But in this new world of enterprise IT, companies need to be able to be flexible and experiment — and, if necessary, throw out failed experiments quickly.

So about a year and a half ago, Vetter’s team started the eXtollo project to bring all the company’s activities around advanced analytics, big data and artificial intelligence into the Azure Cloud and just over two weeks ago, the team shut down its last on-premises servers after slowly turning on its solutions in Microsoft’s data centers in Europe, the U.S. and Asia. All in all, the actual transition between the on-premises data centers and the Azure cloud took about nine months. That may not seem fast, but for an enterprise project like this, that’s about as fast as it gets (and for a while, it fed all new data into both its on-premises data lake and Azure).

If you work for a startup, then all of this probably doesn’t seem like a big deal, but for a more traditional enterprise like Daimler, even just giving up control over the physical hardware where your data resides was a major culture change and something that took quite a bit of convincing. In the end, the solution came down to encryption.

“We needed the means to secure the data in the Microsoft data center with our own means that ensure that only we have access to the raw data and work with the data,” explained Vetter. In the end, the company decided to use the Azure Key Vault to manage and rotate its encryption keys. Indeed, Vetter noted that knowing that the company had full control over its own data was what allowed this project to move forward.
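
The arrangement Vetter describes amounts to customer-controlled encryption: data is encrypted with keys the customer generates and rotates (via Azure Key Vault in Daimler’s case), so the cloud provider only ever holds ciphertext. A minimal sketch, with the cryptography package standing in for vault-managed keys:

```python
from cryptography.fernet import Fernet  # pip install cryptography

# Stand-in for a key held in (and rotated by) Azure Key Vault: the customer
# generates and controls it; the cloud never sees it in the clear.
customer_key = Fernet.generate_key()

def encrypt_for_cloud(raw: bytes) -> bytes:
    return Fernet(customer_key).encrypt(raw)

def decrypt_from_cloud(blob: bytes) -> bytes:
    return Fernet(customer_key).decrypt(blob)

blob = encrypt_for_cloud(b"vehicle error codes ...")
assert decrypt_from_cloud(blob) == b"vehicle error codes ..."
# Rotating the key means issuing a new one and re-encrypting; a vault service
# automates that rotation in the real setup.
```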

Vetter tells me that the company obviously looked at Microsoft’s competitors as well, but he noted that his team didn’t find a compelling offer from other vendors in terms of functionality and the security features that it needed.

Today, Daimler’s big data unit uses tools like Azure HDInsight and Azure Databricks, which cover more than 90 percent of the company’s current use cases. In the future, Vetter also wants to make it easier for less experienced users to use self-service tools to launch AI and analytics services.

While cost is often a factor that counts against the cloud, since renting server capacity isn’t cheap, Vetter argues that this move will actually save the company money, and that storage costs, especially, are going to be cheaper in the cloud than in its on-premises data center (and chances are that Daimler, given its size and prestige as a customer, isn’t exactly paying the same rack rate that others are paying for Azure services).

As with so many big data AI projects, predictions are the focus of much of what Daimler is doing. That may mean looking at a car’s data and error code and helping the technician diagnose an issue or doing predictive maintenance on a commercial vehicle. Interestingly, the company isn’t currently bringing any of its own IoT data from its plants to the cloud. That’s all managed in the company’s on-premises data centers because it wants to avoid the risk of having to shut down a plant because its tools lost the connection to a data center, for example.


By Frederic Lardinois