AI could help push Neo4j graph database growth

Graph databases have always been useful to help find connections across a vast data set, and it turns out that capability is quite handy in artificial intelligence and machine learning too. Today, Neo4j, the makers of the open source and commercial graph database platform, announced the release of Neo4j 3.5, which has a number of new features aimed specifically at AI and machine learning.

Neo4j founder and CEO Emil Eifrem says he recognized the connection between AI, machine learning and graph databases a while ago, but it has taken some time for the market to catch up to the idea.

“There has been a lot of momentum around AI and graphs…Graphs are very fundamental to AI. At the same time we were seeing some early use cases, but not really broad adoption, and that’s what we’re seeing right now,” he explained.

AI graph use cases. Graphic: Neo4j

To help advance AI use cases, today’s release includes a new full-text search capability, which Eifrem says has been one of the most requested features. This is important because when you are making connections between entities, you have to be able to find all of the examples regardless of how something is worded — for example, human versus humans versus people.
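As a minimal sketch of what that might look like in Neo4j 3.5 from the official Python driver (the index name, labels and search terms here are hypothetical), a Lucene-style fuzzy query lets a single search catch variants such as “human” and “humans”:

```python
# Hypothetical sketch: creating and querying a Neo4j 3.5 full-text index
# from Python. Index name, labels and search terms are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Build a full-text index over the name property of Person and Organization nodes.
    session.run(
        'CALL db.index.fulltext.createNodeIndex('
        '"entityNames", ["Person", "Organization"], ["name"])'
    )
    # Lucene fuzzy matching (~) catches wording variants in one query.
    result = session.run(
        'CALL db.index.fulltext.queryNodes("entityNames", "human~") '
        'YIELD node, score RETURN node.name AS name, score'
    )
    for record in result:
        print(record["name"], record["score"])

driver.close()
```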

Part of that was building their own indexing engine to increase indexing speed, which becomes essential with ever more data to process. “Another really important piece of functionality is that we have improved our data ingestion very significantly. We have 5x end-to-end performance improvements when it comes to importing data. And this is really important for connected feature extraction, where obviously, you need a lot of data to be able to train the machine learning,” he said. That also means faster sorting of data.

Other features in the new release include improvements to the company’s own Cypher database query language and better graph visualization, which is useful for showing how machine learning algorithms reach their results, a concept known as AI explainability. They also announced support for the Go language and increased security.

Graph databases are growing increasingly important as we look to find connections between data. The most common use case is the knowledge graph, which lets us see connections across huge data sets. Common examples include who we are connected to on a social network like Facebook, or an ecommerce site suggesting similar items based on something we have already bought.

Other use cases include connected feature extraction, a common machine learning training technique that can look at a lot of data and extract the connections, context and relationships for a particular piece of data, such as suspects in a criminal case and the people connected to them.
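As a rough, hypothetical sketch of connected feature extraction, the query below counts a person’s graph neighborhood and returns the counts as features for a downstream model; the schema, relationship types and property names are invented for illustration:

```python
# Invented example of "connected feature extraction": summarizing a
# person's graph context as numeric features for an ML model. The
# Person/KNOWS/INVOLVED_IN schema is hypothetical.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def connected_features(name):
    with driver.session() as session:
        record = session.run(
            "MATCH (p:Person {name: $name}) "
            "OPTIONAL MATCH (p)-[:KNOWS]-(associate:Person) "
            "OPTIONAL MATCH (p)-[:INVOLVED_IN]->(c:Case)<-[:INVOLVED_IN]-(other:Person) "
            "RETURN count(DISTINCT associate) AS known_associates, "
            "       count(DISTINCT other) AS shared_case_contacts",
            name=name,
        ).single()
        if record is None:
            return {}
        # Each count becomes a column in the feature matrix fed to a model.
        return {
            "known_associates": record["known_associates"],
            "shared_case_contacts": record["shared_case_contacts"],
        }

print(connected_features("Alice"))
driver.close()
```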

Neo4j has over 300 large enterprise customers including Adobe, Microsoft, Walmart, UBS and NASA. The company launched in 2007 and has raised $80 million. The last round was $36 million in November 2016.


By Ron Miller

Outlier raises $6.2M Series A to change how companies use data

Traditionally, companies have gathered data from a variety of sources, then used spreadsheets and dashboards to try and make sense of it all. Outlier wants to change that and deliver a handful of insights that matter most for your job, company and industry right to your inbox. Today the company announced a $6.2 million Series A to further develop that vision.

The round was led by Ridge Ventures with assistance from 11.2 Capital, First Round Capital, Homebrew, Susa Ventures and SV Angel. The company has raised over $8 million.

The startup is trying to solve a difficult problem around delivering meaningful insight without requiring the customer to ask the right questions. With traditional BI tools, you get your data and you start asking questions and seeing if the data can give you some answers. Outlier wants to bring a level of intelligence and automation by pointing out insights without the user having to explicitly ask the right question.

Company founder and CEO Sean Byrnes says his previous company, Flurry, helped deliver mobile analytics to customers, but in his conversations with those customers he kept coming up against the same question: “This is great, but what should I look for in all that data?”

It was such a compelling question that after he sold Flurry to Yahoo in 2014 for more than $200 million, it stuck in the back of his mind and he decided to start a business to solve it. He contends that the first 15 years of BI was about getting answers to basic questions about company performance, but the next 15 will be about finding a way to get the software to ask good questions for you based on the huge amounts of data.

Byrnes admits that when he launched, he didn’t have much sense of how to put this notion into action, and most people he approached didn’t think it was a great idea. He says he heard “No” from a fair number of investors early on because the artificial intelligence required to fuel a solution like this really wasn’t ready in 2015 when he started the company.

He says that it took four or five iterations to get to today’s product, which lets you connect to various data sources and, using artificial intelligence and machine learning, delivers a list of four or five relevant questions to the user’s email inbox, pointing out data they might not have noticed — what he calls “shifts below the surface.” If you’re a retailer, that could be changing market conditions that signal you might want to change your production goals.

Outlier email example. Photo: Outlier
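Outlier has not published how its models work, so as a purely generic illustration of flagging a “shift below the surface,” here is a minimal sketch that scores each day of a metric against its recent history and surfaces anything more than three standard deviations from the trend; the data, window and threshold are all invented:

```python
# Generic anomaly-flagging sketch (NOT Outlier's algorithm): score each
# day against the prior two weeks and surface large deviations.
import numpy as np
import pandas as pd

# Sixty days of a roughly flat metric, with a jump on the latest day.
rng = np.random.default_rng(42)
sales = pd.Series(100 + rng.normal(0, 5, 60))
sales.iloc[-1] = 130

# Compare each day to the previous 14 days (excluding the day itself).
baseline = sales.shift(1).rolling(window=14)
z_scores = (sales - baseline.mean()) / baseline.std()

# Days more than three standard deviations from the recent trend would be
# candidates for a "question" delivered to the inbox.
anomalies = sales[z_scores.abs() > 3]
print(anomalies)
```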

The company launched in 2015. It took some time to polish the product, but today they have 14 employees and 14 customers including Jack Rogers, Celebrity Cruises and Swarovski.

This round should allow them to continue working to grow the company. “We feel like we hit the right product-market fit because we have customers [generating] reproducible results and really changing the way people use the data,” he said.


By Ron Miller

SessionM customer loyalty data aggregator snags $23.8M investment

SessionM announced a $23.8 million Series E investment led by Salesforce Ventures. A bushel of existing investors including Causeway Media Partners, CRV, General Atlantic, Highland Capital and Kleiner Perkins Caufield & Byers also contributed to the round. The company has now raised over $97 million.

At its core, SessionM aggregates loyalty data for brands to help them understand their customer better, says company co-founder and CEO Lars Albright. “We are a customer data and engagement platform that helps companies build more loyal and profitable relationships with their consumers,” he explained.

Essentially that means they are pulling data from a variety of sources and helping brands offer customers more targeted incentives, offers and product recommendations. “We give [our users] a holistic view of that customer and what motivates them,” he said.

Screenshot: SessionM (cropped)

To achieve this, SessionM takes advantage of machine learning to analyze the data stream and integrates with partner platforms like Salesforce, Adobe and others. This certainly fits in with Adobe’s goal to build a customer service experience system of record and Salesforce’s acquisition of Mulesoft in March to integrate data from across an organization, all in the interest of better understanding the customer.

When it comes to using data like this, especially with the advent of GDPR in the EU in May, Albright recognizes that companies need to be more careful with data, and that the regulation has heightened the sensitivity around stewardship for all data-driven businesses like his.

“We’ve been at the forefront of adopting the right product requirements and features that allow our clients and businesses to give their consumers the necessary control to be sure we’re complying with all the GDPR regulations,” he explained.

The company was not discussing valuation or revenue. Its most recent round prior to today’s announcement was a $35 million Series D in 2016, also led by Salesforce Ventures.

SessionM, which was founded in 2011, has around 200 employees with headquarters in downtown Boston. Customers include Coca-Cola, L’Oreal and Barney’s.


By Ron Miller

Sumo Logic brings data analysis to containers

Sumo Logic has long held the goal to help customers understand their data wherever it lives. As we move into the era of containers, that goal becomes more challenging because containers by their nature are ephemeral. The company announced a product enhancement today designed to instrument containerized applications in spite of that.

They are debuting these new features at DockerCon, Docker’s customer conference taking place this week in San Francisco.

Sumo’s CEO Ramin Sayer says containers have begun to take hold over the last 12-18 months with Docker and Kubernetes emerging as tools of choice. Given their popularity, Sumo wants to be able to work with them. “[Docker and Kubernetes] are by far the most standard things that have developed in any new shop, or any existing shop that wants to build a brand new modern app or wants to lift and shift an app from on prem [to the cloud], or have the ability to migrate workloads from Vendor A platform to Vendor B,” he said.

He’s not wrong, of course. Containers and Kubernetes have been taking off in a big way over the last 18 months, and developers and operations teams alike have struggled to instrument these apps to understand how they behave.

“But as that standardization of adoption of that technology has come about, it makes it easier for us to understand how to instrument, collect, analyze, and more importantly, start to provide industry benchmarks,” Sayer explained.

They do this by avoiding the use of agents. Regardless of how you run your application, whether in a VM or a container, Sumo is able to capture the data and give you feedback you might otherwise have trouble retrieving.

Screen shot: Sumo Logic (cropped)

The company has built in native support for Kubernetes and Amazon Elastic Container Service for Kubernetes (Amazon EKS). It also supports the open source tool Prometheus favored by Kubernetes users to extract metrics and metadata. The goal of the Sumo tool is to help customers fix issues faster and reduce downtime.
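The article does not detail Sumo’s collection pipeline, but as a rough illustration of the Prometheus exposition model that Kubernetes-native monitoring integrations build on, here is a minimal sketch using the prometheus_client library to expose a counter over HTTP for a scraper to read; the metric name and port are arbitrary choices for the example:

```python
# Minimal illustration of the Prometheus metrics model (not Sumo Logic's
# own pipeline): expose a counter that a scraper can collect.
import random
import time

from prometheus_client import Counter, start_http_server

# Arbitrary example metric; a real app would track something meaningful.
REQUESTS = Counter("demo_requests_total", "Requests handled by the demo app")

if __name__ == "__main__":
    # Metrics become scrapeable at http://localhost:8000/metrics
    start_http_server(8000)
    while True:
        REQUESTS.inc()
        time.sleep(random.uniform(0.1, 0.5))
```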

As they work with this technology, they can begin to understand norms and pass that information onto customers. “We can guide them and give them best practices and tips, not just on what they’ve done, but how they compare to other users on Sumo,” he said.

Sumo Logic was founded in 2010 and has raised $230 million, according to data on Crunchbase. Its most recent round was a $70 million Series F led by Sapphire Ventures last June.


By Ron Miller

Devo scores $25 million and cool new name

Logtrust is now known as Devo in one of the cooler name changes I’ve seen in a long time. Whether they intended to pay homage to the late 70s band is not clear, but investors probably didn’t care, as they gave the data operations startup a bushel of money today.

The company now known as Devo announced a $25 million Series C round led by Insight Venture Partners with participation from Kibo Ventures. Today’s investment brings the total raised to $71 million.

The company changed its name because it was about much more than logs, according to CEO Walter Scott. It offers a cloud service that allows customers to stream massive amounts of data — think terabytes or even petabytes — relieving them of the need to worry about all of the scaling and hardware requirements that processing this amount of data would require. That data could come from web server logs, security data from firewalls or transactions taking place on backend systems, to name a few examples.

The data can live on prem if required, but the processing always gets done in the cloud to provide for the scaling needs. Scott says this is about giving companies the ability to process and understand massive amounts of data that was previously only within reach of web-scale companies like Google, Facebook or Amazon.

But it involves more than simply collecting the data. “It’s the combination of us being able to collect all of that data together with running analytics on top of it all in a unified platform, then allowing a very broad spectrum of the business [to make use of it],” Scott explained.

Devo dashboard. Photo: Devo

Devo sees Sumo Logic, Elastic and Splunk as its primary competitors in this space, but like many startups it also often battles companies trying to build their own systems, a difficult approach for any company to take when dealing with this amount of data.

The company, which was founded in Spain, is now based in Cambridge, Massachusetts, and has close to 100 employees. Scott says he has the budget to double that by the end of the year, although he’s not sure they will be able to hire that many people that rapidly.


By Ron Miller

Etleap scores $1.5 million seed to transform how we ingest data

Etleap is a play on words for a common set of data practices: extract, transform and load, or ETL. The startup is trying to place these activities in a modern context, automating what they can and in general speeding up what has been a tedious and highly technical practice. Today, they announced a $1.5 million seed round.
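As a toy illustration of the three steps the name refers to (and not of Etleap’s actual implementation), here is a minimal Python sketch that extracts rows from a CSV file, transforms them, and loads them into a SQLite table standing in for a warehouse; the file name and schema are invented for the example. A product like Etleap aims to replace hand-maintained scripts of this kind with managed pipelines, monitoring and connectors.

```python
# Toy end-to-end ETL example: extract, transform, load. File name and
# table schema are invented; this is not Etleap's product.
import csv
import sqlite3

def extract(path):
    """Extract: read raw rows from a source file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    """Transform: normalize types and field values before loading."""
    return [(row["email"].lower(), float(row["amount"])) for row in rows]

def load(records, db_path="warehouse.db"):
    """Load: write the cleaned records into a warehouse table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS orders (email TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", records)
    conn.commit()
    conn.close()

if __name__ == "__main__":
    load(transform(extract("orders.csv")))
```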

Investors include First Round Capital, SV Angel, Liquid2, BoxGroup and other unnamed investors. The startup launched five years ago as a Y Combinator company. It spent a good 2.5 years building out the product, says CEO and founder Christian Romming. They haven’t required additional funding until now because they have been working with actual customers. Those include Okta, PagerDuty and Mode, among others.

Romming started out at ad tech startup VigLink, and while there he encountered a problem that was hard to solve. “Our analysts and scientists were frustrated. Integration of the data sources wasn’t always a priority and when something broke, they couldn’t get it fixed until a developer looked at it.” That lack of control slowed things down and made it hard to keep the data warehouse up to date.

He saw an opportunity in solving that problem and started Etleap. While there were (and continue to be) legacy solutions like Informatica, Talend and Microsoft SQL Server Integration Services, he said when he studied these at a deeply technical level, he found they required a great deal of help to implement. He wanted to simplify ETL as much as possible, putting data integration into the hands of much less technical end users, rather than relying on IT and consultants.

One of the problems with traditional ETL is that the data analysts who make use of the data tend to get involved very late, after the tools have already been chosen, and Romming says his company wants to change that. “They get to consume whatever IT has created for them. You end up with a bread line where analysts are at the mercy of IT to get their jobs done. That’s one of the things we are trying to solve. We don’t think there should be any engineering at all to set up an ETL pipeline,” he said.

Etleap is delivered as a managed SaaS, or customers can run it within their own AWS accounts. Regardless of the method, it handles all of the managing, monitoring and operations for the customer.

Romming emphasizes that the product is really built for cloud data warehouses. For now, they are concentrating on the AWS ecosystem, but have plans to expand beyond that down the road. “We want to help more enterprise companies make better use of their data, while modernizing data warehousing infrastructure and making use of cloud data warehouses,” he explained.

The company currently has 15 employees, but Romming plans to at least double that in the next 12-18 months, mostly increasing the engineering team to help further build out the product and create more connectors.


By Ron Miller

Splunk turns data processing chops to Industrial IoT

Splunk has always been known as a company that can sift through oodles of log or security data and help customers surface the important bits. Today, it announced it was going to try to apply that same skill set to Industrial Internet of Things data.

IIoT data is found in manufacturing settings, typically coming from sensors on the factory floor that give engineers and plant managers information about the health and well-being of the machines running in the facility. Up until now, that data hasn’t had a modern place to live. Traditionally, companies pull the data into Excel and try to slice and dice it to find the issues.

Splunk wants to change that with Splunk Industrial Asset Intelligence (IAI). The new product pulls data from a variety of sources and presents management and engineers with the information they need to see, along with critical alerts.

The new product takes advantage of some existing Splunk tools and is built on top of Splunk Enterprise, but instead of processing data coming from IT systems, it looks at Industrial Control Systems (ICS), sensors, SCADA (supervisory control and data acquisition) systems and applications, pulling all of that data together and presenting it to the key constituencies in a dashboard.

It is not a simple matter, however, to set up these dashboards, pull the data from the various data sources, some of which may be modern and some quite old, and figure out what’s important for a particular customer. Splunk says it has turned to systems integrators to help with that part of the implementation.

Splunk understands data, but it also recognizes that working in the manufacturing sector is new territory, so they are looking to SIs with expertise in manufacturing to help them work with the unique requirements of this group. But it’s still data, says Ammar Maraqa, Splunk SVP of Business Operations and Strategy and General Manager of IoT Markets.

“If you step back at the end of the day, Splunk is able to ingest and correlate heterogeneous sets of data to provide a view into what’s happening in their environments,” Maraqa said.

With today’s announcement, Splunk Industrial Asset Intelligence exits beta for a limited release. It should be generally available sometime in the fall.

IoT devices could be next customer data frontier

At the Adobe Summit this week in Las Vegas, the company introduced what could be the ultimate customer experience construct, a customer experience system of record that pulls in information, not just from Adobe tools, but wherever it lives. In many ways it marked a new period in the notion of customer experience management, putting it front and center of the marketing strategy.

Adobe was not alone, of course. Salesforce, with its three-headed monster, the sales, marketing and service clouds, was also thinking of a similar idea. In fact, they spent $6.5 billion last week to buy MuleSoft to act as a data integration layer to access customer information from across the enterprise software stack, whether on prem, in the cloud, or inside or outside of Salesforce. And they announced the Salesforce Integration Cloud this week to make use of their newest company.

As data collection takes center stage, we actually could be on the edge of yet another data revolution, one that could be more profound than even the web and mobile were before it. That is…the Internet of Things.

Here comes IoT

There are three main pieces to that IoT revolution at the moment from a consumer perspective. First of all, there is the smart speaker like the Amazon Echo or Google Home. These provide a way for humans to interact verbally with machines, a notion that is only now possible through the marriage of all this data, sheer (and cheap) compute power and the AI algorithms that fuel all of it.

Next, we have the idea of the connected car, one separate from the self-driving car. Much like the smart speaker, humans can interact with the car to find directions and recommendations, and that leaves a data trail in its wake. Finally, we have sensors like iBeacons sitting in stores, providing retailers with a world of information about a customer’s journey through the store — what they like or don’t like, what they pick up, what they try on and so forth.

There are very likely a host of other categories too, and all of this information is data that needs to be processed and understood just like any other signals coming from customers, but it also has unique characteristics around volume and velocity — it is truly big data, with all of the issues inherent in processing information at that scale.

That means it needs to be ingested, digested and incorporated into that central customer record-keeping system to drive the content and experiences you need to create to keep your customers happy — or so the marketing software companies tell us, at least. (We also need to consider the privacy implications of such a record, but that is the subject for another article.)

Building a better relationship

Regardless of the vendor, all of this is about understanding the customer better to provide a central data-gathering system with the hope of giving people exactly what they want. We are no longer a generic mass of consumers. We are instead individuals with different needs, desires and requirements, and the best way to please us, they say, is to understand us so well that the brand can deliver the perfect experience at exactly the right moment.

Photo: Ron Miller

That involves listening to the digital signals we give off without even thinking about it. We carry mobile, connected computers in our pockets and they send out a variety of information about our whereabouts and what we are doing. Social media acts as a broadcast system that brands can tap into to better understand us (or so the story goes).

Part of what Adobe, Salesforce and others can deliver is a way to gather that information, pull it together into this uber record-keeping system and apply a level of machine learning and intelligence to help further the brand’s ultimate goal of serving a customer of one and delivering an efficient (and perhaps even pleasurable) experience.

Getting on board

At an Adobe Summit session this week on IoT (which I moderated), the audience was polled a couple of times. In one show of hands, they were asked how many owned a smart speaker, and about three quarters indicated they owned at least one, but when asked how many were developing applications for these same devices, only a handful of hands went up. This was in a room full of marketers, mind you.

Photo: Ron Miller

That suggests that there is a disconnect between usage and tools to take advantage of them. The same could be said for the other IoT data sources, the car and sensor tech, or any other connected consumer device. Just as we created a set of tools to capture and understand the data coming from mobile apps and the web, we need to create the same thing for all of these IoT sources.

That means coming up with creative ways to take advantage of another interaction (and data collection) point. This is an entirely new frontier with all of the opportunity involved in that, and that suggests startups and established companies alike need to be thinking about solutions to help companies do just that.

Pure Storage teams with Nvidia on GPU-fueled Flash storage solution for AI

As companies gather increasing amounts of data, they face a choice over bottlenecks: they can live in the storage component or in the backend compute system. Some companies have attacked the problem by using GPUs to streamline backend processing or flash storage to speed up data access. Pure Storage wants to give customers the best of both worlds.

Today it announced Airi, a complete data storage solution for AI workloads in a box.

Under the hood, Airi starts with a Pure Storage FlashBlade, a storage solution that Pure created specifically with AI and machine learning processing in mind. Nvidia contributes the raw compute power with four NVIDIA DGX-1 supercomputers, delivering four petaflops of performance with NVIDIA Tesla V100 GPUs. Arista provides the networking hardware to make it all work together with Arista 100GbE switches. The software glue layer comes from the NVIDIA GPU Cloud deep learning stack and the Pure Storage AIRI Scaling Toolkit.

Photo: Pure Storage

One interesting aspect of this deal is that the FlashBlade product operates as a separate unit inside the Pure Storage organization. They have put together a team of engineers with AI and data pipeline expertise, with a focus on finding ways to move beyond the traditional storage market and figure out where the market is going.

This approach certainly does that, but the question is whether companies want to chase the on-prem hardware approach or take this kind of data to the cloud. Pure would argue that the data gravity of AI workloads makes this difficult to achieve with a cloud solution, but we are seeing increasingly large amounts of data moving to the cloud, with the cloud vendors providing tools for data scientists to process that data.

If companies choose to go the hardware route over the cloud, each vendor in this equation — whether Nvidia, Pure Storage or Arista — should benefit from a multi-vendor sale. The idea ultimately is to provide customers with a one-stop solution they can install quickly inside a data center if that’s the approach they want to take.