Tecton teams with founder of Feast open source machine learning feature store

Tecton, the company that pioneered the notion of the machine learning feature store, has teamed up with the founder of the open source feature store project called Feast. Today the company announced the release of version 0.10 of the open source tool.

The feature store is a concept that the Tecton founders came up with when they were engineers at Uber. Shortly thereafter an engineer named Willem Pienaar read the founders’ Uber blog posts on building a feature store and went to work building Feast as an open source version of the concept.

“The idea of Tecton [involved bringing] feature stores to the industry, so we build basically the best in class, enterprise feature store. […] Feast is something that Willem created, which I think was inspired by some of the early designs that we published at Uber. And he built Feast and it evolved as kind of like the standard for open source feature stores, and it’s now part of the Linux Foundation,” Tecton co-founder and CEO Mike Del Balso explained.

Tecton later hired Pienaar, who today leads the company’s open source team. While Tecton did not originally set out to build an open source product, the two products are closely aligned, and it made sense to bring Pienaar on board.

“The products are very similar in a lot of ways. So I think there’s a similarity there that makes this somewhat symbiotic, and there is no explicit convergence necessary. The Tecton product is a superset of what Feast has. So it’s an enterprise version with a lot more advanced functionality, but at Feast we have a battle-tested feature store that’s open source,” Pienaar said.

As we wrote in a December 2020 story on the company’s $35 million Series B, it describes a feature store as “an end-to-end machine learning management system that includes the pipelines to transform the data into what are called feature values, then it stores and manages all of that feature data and finally it serves a consistent set of data.”
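The moving parts in that definition are easier to see in miniature. Below is a toy Python sketch of the transform, store, and serve flow a feature store manages; every name and number here is invented for illustration and is not Feast’s or Tecton’s actual API:

```python
class ToyFeatureStore:
    """Toy in-memory feature store: transform raw data into feature values,
    store them, and serve a consistent set on request. Names are invented;
    this is not the Feast or Tecton API."""

    def __init__(self):
        self._features = {}  # (entity_id, feature_name) -> value

    def ingest(self, entity_id, raw_events, transform):
        # The pipeline step: turn raw events into named feature values.
        for name, value in transform(raw_events).items():
            self._features[(entity_id, name)] = value

    def get_online_features(self, entity_id, feature_names):
        # The serving step: hand back stored values for model inference.
        return {n: self._features.get((entity_id, n)) for n in feature_names}


# Hypothetical example: ride features for a driver, from raw trip events.
def trip_features(events):
    return {"trip_count": len(events),
            "avg_fare": sum(e["fare"] for e in events) / len(events)}

store = ToyFeatureStore()
store.ingest("driver_42", [{"fare": 10.0}, {"fare": 14.0}], trip_features)
print(store.get_online_features("driver_42", ["trip_count", "avg_fare"]))
# {'trip_count': 2, 'avg_fare': 12.0}
```

A production feature store adds the hard parts this sketch skips: scheduled and streaming pipelines, point-in-time correctness, and consistency between the data used for training and for serving.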

Del Balso says that from a business perspective, contributing to the open source feature store exposes his company to a different group of users, and the commercial and open source products can feed off one another as the company develops both.

“What we really like, and what we feel is very powerful here, is that we’re deeply in the Feast community and get to learn from all of the interesting use cases […] to improve the Tecton product. And similarly, we can use the feedback that we’re hearing from our enterprise customers to improve the open source project. That’s the kind of cross learning, and ideally that feedback loop involved there,” he said.

The plan is for Tecton to continue being a primary contributor with a team inside Tecton dedicated to working on Feast. Today, the company is releasing version 0.10 of the project.


By Ron Miller

Docugami’s new model for understanding documents cuts its teeth on NASA archives

You hear so much about data these days that you might forget that a huge amount of the world runs on documents: a veritable menagerie of heterogeneous files and formats holding enormous value yet incompatible with the new era of clean, structured databases. Docugami plans to change that with a system that intuitively understands any set of documents and intelligently indexes their contents — and NASA is already on board.

If Docugami’s product works as planned, anyone will be able to take piles of documents accumulated over the years and near-instantly convert them to the kind of data that’s actually useful to people.

Because it turns out that running just about any business ends up producing a ton of documents. Contracts and briefs in legal work, leases and agreements in real estate, proposals and releases in marketing, medical charts, etc, etc. Not to mention the various formats: Word docs, PDFs, scans of paper printouts of PDFs exported from Word docs, and so on.

Over the last decade there’s been an effort to corral this problem, but movement has largely been on the organizational side: put all your documents in one place, share and edit them collaboratively. Understanding the document itself has pretty much been left to the people who handle them, and for good reason — understanding documents is hard!

Think of a rental contract. We humans understand that when the renter is named as Jill Jackson, later mentions of “the renter” refer to that same person. Furthermore, across a hundred other contracts, we understand that the renters in those documents are the same type of person or concept in the context of each document, but not the same actual person. These are surprisingly difficult concepts for machine learning and natural language understanding systems to grasp and apply. Yet if they could be mastered, an enormous amount of useful information could be extracted from the millions of documents squirreled away around the world.

What’s up, .docx?

Docugami founder Jean Paoli says they’ve cracked the problem wide open, and while it’s a major claim, he’s one of few people who could credibly make it. Paoli was a major figure at Microsoft for decades, and among other things helped create the XML format — you know all those files that end in x, like .docx and .xlsx? Paoli is at least partly to thank for them.

“Data and documents aren’t the same thing,” he told me. “There’s a thing you understand, called documents, and there’s something that computers understand, called data. Why are they not the same thing? So my first job [at Microsoft] was to create a format that can represent documents as data. I created XML with friends in the industry, and Bill accepted it.” (Yes, that Bill.)

The formats became ubiquitous, yet 20 years later the same problem persists, having grown in scale with the digitization of industry after industry. But for Paoli the solution is the same. At the core of XML was the idea that a document should be structured almost like a webpage: boxes within boxes, each clearly defined by metadata — a hierarchical model more easily understood by computers.
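To make the “boxes within boxes” idea concrete, here is a small hand-written example using Python’s standard XML library; the tag names and values are hypothetical, not Docugami’s actual schema:

```python
import xml.etree.ElementTree as ET

# A hypothetical lease represented hierarchically: each box of content
# sits inside a larger labeled box, so a machine can navigate it.
doc = ET.fromstring("""
<lease>
  <parties>
    <party role="renter"><name>Jill Jackson</name></party>
    <party role="landlord"><name>Acme Properties</name></party>
  </parties>
  <terms>
    <rent currency="USD">1500</rent>
    <duration unit="months">12</duration>
  </terms>
</lease>
""")

# Because the structure is labeled, targeted questions replace flat
# text scanning.
renter = doc.find(".//party[@role='renter']/name").text
rent = int(doc.find(".//rent").text)
print(renter, rent)  # Jill Jackson 1500
```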

Illustration showing a document corresponding to pieces of another document.

Image Credits: Docugami

“A few years ago I drank the AI kool-aid, got the idea to transform documents into data. I needed an algorithm that navigates the hierarchical model, and they told me that the algorithm you want does not exist,” he explained. “The XML model, where every piece is inside another, and each has a different name to represent the data it contains — that has not been married to the AI model we have today. That’s just a fact. I hoped the AI people would go and jump on it, but it didn’t happen.” (“I was busy doing something else,” he added, to excuse himself.)

The lack of compatibility with this new model of computing shouldn’t come as a surprise — every emerging technology carries with it certain assumptions and limitations, and AI has focused on a few other, equally crucial areas like speech understanding and computer vision. The approach taken there doesn’t match the needs of systematically understanding a document.

“Many people think that documents are like cats. You train the AI to look for their eyes, for their tails… documents are not like cats,” he said.

It sounds obvious, but it’s a real limitation: advanced AI methods like segmentation, scene understanding, multimodal context, and such are all a sort of hyper-advanced cat detection that has moved beyond cats to detect dogs, car types, facial expressions, locations, etc. Documents are too different from one another, or in other ways too similar, for these approaches to do much more than roughly categorize them.

And as for language understanding, it’s good in some ways but not in the ways Paoli needed. “They’re working sort of at the English language level,” he said. “They look at the text but they disconnect it from the document where they found it. I love NLP people, half my team is NLP people — but NLP people don’t think about business processes. You need to mix them with XML people, people who understand computer vision, then you start looking at the document at a different level.”

Docugami in action

Illustration showing a person interacting with a digital document.

Image Credits: Docugami

Paoli’s goal couldn’t be reached by adapting existing tools (beyond mature primitives like optical character recognition), so he assembled his own private AI lab, where a multi-disciplinary team has been tinkering away for about two years.

“We did core science, self-funded, in stealth mode, and we sent a bunch of patents to the patent office,” he said. “Then we went to see the VCs, and SignalFire basically volunteered to lead the seed round at $10 million.”

Coverage of the round didn’t really get into the actual experience of using Docugami, but Paoli walked me through the platform with some live documents. I wasn’t given access myself and the company wouldn’t provide screenshots or video, saying it is still working on the integrations and UI, so you’ll have to use your imagination… but if you picture pretty much any enterprise SaaS service, you’re 90 percent of the way there.

As the user, you upload any number of documents to Docugami, from a couple dozen to hundreds or thousands. These enter a machine understanding workflow that parses the documents, whether they’re scanned PDFs, Word files, or something else, into an XML-esque hierarchical organization unique to the contents.

“Say you’ve got 500 documents, we try to categorize it in document sets, these 30 look the same, those 20 look the same, those 5 together. We group them with a mix of hints coming from how the document looked, what it’s talking about, what we think people are using it for, etc,” said Paoli. Other services might be able to tell the difference between a lease and an NDA, but documents are too diverse to slot into pre-trained ideas of categories and expect it to work out. Every set of documents is potentially unique, and so Docugami trains itself anew every time, even for a set of one. “Once we group them, we understand the overall structure and hierarchy of that particular set of documents, because that’s how documents become useful: together.”

Illustration showing a document being turned into a report and a spreadsheet.

Image Credits: Docugami

That doesn’t just mean it picks up on header text and creates an index, or lets you search for words. The data that is in the document, for example who is paying whom, how much and when, and under what conditions, all that becomes structured and editable within the context of similar documents. (It asks for a little input to double check what it has deduced.)

It can be a little hard to picture, but now just imagine that you want to put together a report on your company’s active loans. All you need to do is highlight the information that’s important to you in an example document — literally, you just click “Jane Roe” and “$20,000” and “5 years” anywhere they occur — and then select the other documents you want to pull corresponding information from. A few seconds later you have an ordered spreadsheet with names, amounts, dates, anything you wanted out of that set of documents.
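The shape of that output is easy to mimic in miniature. The sketch below fakes the extraction step with a regular expression over invented loan documents (the real system learns from your highlighted examples rather than hand-written patterns), just to show the document-set-to-spreadsheet flow:

```python
import csv
import io
import re

# Invented, deliberately uniform "documents"; real inputs would be far
# messier, which is exactly why hand-written patterns don't scale.
docs = [
    "Borrower: Jane Roe. Amount: $20,000. Term: 5 years.",
    "Borrower: John Doe. Amount: $7,500. Term: 3 years.",
]
pattern = re.compile(
    r"Borrower: (.+?)\. Amount: \$([\d,]+)\. Term: (\d+) years")

# Pull the highlighted fields from every document into one ordered table.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["name", "amount", "term_years"])
for d in docs:
    m = pattern.search(d)
    writer.writerow([m.group(1), m.group(2).replace(",", ""), m.group(3)])

print(out.getvalue())
```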

All this data is meant to be portable too, of course — there are integrations planned with various other common pipes and services in business, allowing for automatic reports, alerts if certain conditions are reached, automated creation of templates and standard documents (no more keeping an old one around with underscores where the principals go).

Remember, this is all half an hour after you uploaded them in the first place, no labeling or pre-processing or cleaning required. And the AI isn’t working from some preconceived notion or format of what a lease document looks like. It’s learned all it needs to know from the actual docs you uploaded — how they’re structured, where things like names and dates figure relative to one another, and so on. And it works across verticals, with an interface anyone can figure out in a few minutes. Whether you’re in healthcare data entry or construction contract management, the tool should make sense.

The web interface where you ingest and create new documents is one of the main tools, while the other lives inside Word. There Docugami acts as a sort of assistant that’s fully aware of every other document of whatever type you’re in, so you can create new ones, fill in standard information, comply with regulations, and so on.

Okay, so processing legal documents isn’t exactly the most exciting application of machine learning in the world. But I wouldn’t be writing this (at all, let alone at this length) if I didn’t think this was a big deal. This sort of deep understanding of document types can be found here and there in established industries with standard document types (such as police or medical reports), but have fun waiting until someone trains a bespoke model for your kayak rental service. Small businesses have just as much value locked up in documents as large enterprises — and they can’t afford to hire a team of data scientists. And even the big organizations can’t do it all manually.

NASA’s treasure trove

Image Credits: NASA

The problem is extremely difficult, yet to humans seems almost trivial. You or I could easily glance through 20 similar documents and pull out a list of names and amounts, perhaps even in less time than it takes Docugami to crawl them and train itself.

But AI, after all, is meant to imitate and exceed human capacity, and it’s one thing for an account manager to do monthly reports on 20 contracts — quite another to do a daily report on a thousand. Yet Docugami accomplishes both equally easily — which is where it fits into both the enterprise system, where scaling this kind of operation is crucial, and NASA, which is buried under a backlog of documentation from which it hopes to glean clean data and insights.

If there’s one thing NASA’s got a lot of, it’s documents. Its reasonably well maintained archives go back to its founding, and many important ones are available by various means — I’ve spent many a pleasant hour perusing its cache of historical documents.

But NASA isn’t looking for new insights into Apollo 11. Through its many past and present programs, solicitations, grant programs, budgets, and of course engineering projects, it generates a huge amount of documents — being, after all, very much a part of the federal bureaucracy. And as with any large organization with its paperwork spread over decades, NASA’s document stash represents untapped potential.

Expert opinions, research precursors, engineering solutions, and a dozen more categories of important information are sitting in files searchable perhaps by basic word matching but otherwise unstructured. Wouldn’t it be nice for someone at JPL to get it in their head to look at the evolution of nozzle design, and within a few minutes have a complete and current list of documents on that topic, organized by type, date, author, and status? What about the patent advisor who needs to provide a NIAC grant recipient information on prior art — shouldn’t they be able to pull those old patents and applications up with more specificity than a simple keyword search allows?

The NASA SBIR grant, awarded last summer, isn’t for any specific work, like collecting all the documents of such and such a type from Johnson Space Center or something. It’s an exploratory or investigative agreement, as many of these grants are, and Docugami is working with NASA scientists on the best ways to apply the technology to their archives. (One of the best applications may be to the SBIR and other small business funding programs themselves.)

Another SBIR grant with the NSF differs in that, while at NASA the team is looking into better organizing tons of disparate types of documents with some overlapping information, at NSF they’re aiming to better identify “small data.” “We are looking at the tiny things, the tiny details,” said Paoli. “For instance, if you have a name, is it the lender or the borrower? The doctor or the patient name? When you read a patient record, penicillin is mentioned, is it prescribed or prohibited? If there’s a section called allergies and another called prescriptions, we can make that connection.”
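That section-context trick can be illustrated with a deliberately tiny sketch; the section names and lookup logic here are invented, not Docugami’s method:

```python
# A toy patient record: the same drug name means very different things
# depending on which labeled section it appears in.
record = {
    "Allergies": ["penicillin"],
    "Prescriptions": ["amoxicillin 500mg"],
}

def status_of(drug, record):
    """Disambiguate a drug mention using the section that contains it."""
    for section, items in record.items():
        if any(drug in item for item in items):
            return "prohibited" if section == "Allergies" else "prescribed"
    return "not mentioned"

print(status_of("penicillin", record))   # prohibited
print(status_of("amoxicillin", record))  # prescribed
```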

“Maybe it’s because I’m French”

When I pointed out the rather small budgets involved with SBIR grants and how his company couldn’t possibly survive on these, he laughed.

“Oh, we’re not running on grants! This isn’t our business. For me, this is a way to work with scientists, with the best labs in the world,” he said, while noting many more grant projects were in the offing. “Science for me is a fuel. The business model is very simple – a service that you subscribe to, like Docusign or Dropbox.”

The company is only just now beginning its real business operations, having made a few connections with integration partners and testers. But over the next year it will expand its private beta and eventually open it up — though there’s no timeline on that just yet.

“We’re very young. A year ago we were like five, six people, now we went and got this $10M seed round and boom,” said Paoli. But he’s certain that this is a business that will be not just lucrative but will represent an important change in how companies work.

“People love documents. Maybe it’s because I’m French,” he said, “but I think text and books and writing are critical — that’s just how humans work. We really think people can help machines think better, and machines can help people think better.”


By Devin Coldewey

Microsoft goes all in on healthcare with $19.7B Nuance acquisition

When Microsoft announced it was acquiring Nuance Communications this morning for $19.7 billion, you could be excused for doing a Monday morning double take at the hefty price tag.

That’s surely a lot of money for a company on a $1.4 billion run rate, but Microsoft, which has already partnered with the speech-to-text market leader on several products over the last couple of years, saw a company firmly embedded in healthcare and it decided to go all in.

And $20 billion is certainly all in, even for a company the size of Microsoft. But 2020 forced us to change the way we do business, from restaurants to retailers to doctors. The pandemic in particular changed the way we interact with our medical providers. We learned very quickly that you don’t have to drive to an office, wait in a waiting room and then an exam room, all to see the doctor for a few minutes.

Instead, we can get on the line, have a quick chat and be on our way. It won’t work for every condition, of course — there will always be times the physician needs to see you — but for many appointments, such as reviewing test results or talk therapy, telehealth could suffice.

Microsoft CEO Satya Nadella says that Nuance is at the center of this shift, especially with its use of cloud and artificial intelligence, and that’s why the company was willing to pay the amount it did to get it.

“AI is technology’s most important priority, and healthcare is its most urgent application. Together, with our partner ecosystem, we will put advanced AI solutions into the hands of professionals everywhere to drive better decision-making and create more meaningful connections, as we accelerate growth of Microsoft Cloud in Healthcare and Nuance,” Nadella said in a post announcing the deal.

Microsoft sees this deal doubling what was already a considerable total addressable market to nearly $500 billion. While TAMs always tend to run high, that is still a substantial number.

It also fits with Gartner data, which found that by 2022, 75% of healthcare organizations will have a formal cloud strategy in place. The AI component only adds to that number and Nuance brings 10,000 existing customers to Microsoft including some of the biggest healthcare organizations in the world.

Brent Leary, founder and principal analyst at CRM Essentials, says the deal could provide Microsoft with a ton of health data to help feed the underlying machine learning models and make them more accurate over time.

“There is going to be a ton of health data being captured by interactions coming through telemedicine, and this could create a whole new level of health intelligence,” Leary told me.

That of course could drive a lot of privacy concerns where health data is involved, and it will be up to Microsoft, which just experienced a major breach on its Exchange email server products last month, to assure the public that their sensitive health data is being protected.

Leary says that ensuring data privacy is going to be absolutely key to the success of the deal. “The potential this move has is pretty powerful, but it will only be realized if the data and insights that could come from it are protected and secure — not only protected from hackers but also from unethical use. Either could derail what could be a game changing move,” he said.

Microsoft also seemed to recognize that when it wrote, “Nuance and Microsoft will deepen their existing commitments to the extended partner ecosystem, as well as the highest standards of data privacy, security and compliance.”

We are clearly on the edge of a sea change when it comes to how we interact with our medical providers in the future. COVID pushed medicine deeper into the digital realm in 2020 out of simple necessity. It wasn’t safe to go into the office unless absolutely necessary.

The Nuance acquisition, which is expected to close some time later this year, could help Microsoft shift deeper into the market. It could even bring Teams into it as a meeting tool, but it’s all going to depend on the trust level people have with this approach, and it will be up to the company to make sure that both healthcare providers and the people they serve have that.


By Ron Miller

Microsoft is acquiring Nuance Communications for $19.7B

Microsoft agreed today to acquire Nuance Communications, a leader in speech-to-text software, for $19.7 billion. Bloomberg broke the story over the weekend that the two companies were in talks.

In a post announcing the deal, the company said this was about increasing its presence in the healthcare vertical, a place where Nuance has done well in recent years. In fact, Microsoft announced the Microsoft Cloud for Healthcare last year, and this deal is about accelerating its presence there. Nuance’s products in this area include Dragon Ambient eXperience, Dragon Medical One and PowerScribe One for radiology reporting.

“Today’s acquisition announcement represents the latest step in Microsoft’s industry-specific cloud strategy,” the company wrote. The acquisition also builds on several integrations and partnerships the two companies have made in the last couple of years.

The company boasts 10,000 healthcare customers, according to information on the website. Those include AthenaHealth, Johns Hopkins, Mass General Brigham and Cleveland Clinic to name but a few, and it was that customer base that attracted Microsoft to pay the price it did to bring Nuance into the fold.

Nuance CEO Mark Benjamin will remain with the company and report to Scott Guthrie, Microsoft’s EVP in charge of the cloud and AI group.

Nuance has a complex history. It went public in 2000 and began buying speech recognition products, including Dragon Dictate from Lernout & Hauspie, in 2001. It merged with a company called ScanSoft in 2005. That company began life in 1992 as Visioneer, a scanning company.

Today, the company has a number of products including Dragon Dictate, a consumer and business speech-to-text product that dates back to the early 1990s. It’s also involved in speech recognition, chat bots and natural language processing, particularly in healthcare and other verticals.

The company has 6,000 employees spread across 27 countries. In its most recent earnings report, from November 2020 covering Q4 2020, the company reported $352.9 million in revenue, compared to $387.6 million in the same period a year prior. That’s not the direction a company wants to go in, but it still amounts to a run rate of over $1.4 billion.

At the time of that earnings call, the company also announced it was selling its medical transcription and electronic health record (EHR) Go-Live services to Assured Healthcare Partners and Aeries Technology Group. Company CEO Benjamin said this was about helping the company concentrate on its core speech services.

“With this sale, we will reach an important milestone in our journey towards a more focused strategy of advancing our Conversational AI, natural language understanding and ambient clinical intelligence solutions,” Benjamin said in a statement at the time.

It’s worth noting that Microsoft already has a number of speech recognition and chat bot products of its own, including desktop speech-to-text services in Windows and on Azure, but it took the chance to buy a market leader and go deeper into the healthcare vertical.

The transaction has already been approved by both company boards and Microsoft reports it expects the deal to close by the end of this year, subject to standard regulatory oversight and approval by Nuance shareholders.

This would mark the second largest purchase by Microsoft ever, only surpassed by the $26.2 billion the company paid for LinkedIn in 2016.


By Ron Miller

Immersion cooling to offset data centers’ massive power demands gains a big booster in Microsoft

LiquidStack does it. So does Submer. They’re both dropping servers carrying sensitive data into goop in an effort to save the planet. Now they’re joined by one of the biggest tech companies in the world in their efforts to improve the energy efficiency of data centers, because Microsoft is getting into the liquid-immersion cooling market.

Microsoft is using a liquid it developed in-house that’s engineered to boil at 122 degrees Fahrenheit (lower than the boiling point of water) to act as a heat sink, reducing the temperature inside the servers so they can operate at full power without any risks from overheating.

The vapor from the boiling fluid is converted back into a liquid through contact with a cooled condenser in the lid of the tank that stores the servers.

“We are the first cloud provider that is running two-phase immersion cooling in a production environment,” said Husam Alissa, a principal hardware engineer on Microsoft’s team for datacenter advanced development in Redmond, Washington, in a statement on the company’s internal blog. 

While that claim may be true, liquid cooling is a well-known approach to moving heat around to keep systems working. Cars use liquid cooling to keep their motors humming as they head out on the highway.

As technology companies confront the physical limits of Moore’s Law, the demand for faster, higher-performance processors means designing new architectures that can handle more power, the company wrote in a blog post. Power flowing through central processing units has increased from 150 watts to more than 300 watts per chip, and the GPUs responsible for much of the bitcoin mining, artificial intelligence applications and high-end graphics each consume more than 700 watts per chip.

It’s worth noting that Microsoft isn’t the first tech company to apply liquid cooling to data centers and the distinction that the company uses of being the first “cloud provider” is doing a lot of work. That’s because bitcoin mining operations have been using the tech for years. Indeed, LiquidStack was spun out from a bitcoin miner to commercialize its liquid immersion cooling tech and bring it to the masses.

“Air cooling is not enough”

More power flowing through the processors means hotter chips, which means the need for better cooling or the chips will malfunction.

“Air cooling is not enough,” said Christian Belady, vice president of Microsoft’s datacenter advanced development group in Redmond, in an interview for the company’s internal blog. “That’s what’s driving us to immersion cooling, where we can directly boil off the surfaces of the chip.”

For Belady, the use of liquid cooling technology brings the density and compression of Moore’s Law up to the datacenter level.

The results, from an energy consumption perspective, are impressive. Microsoft investigated liquid immersion as a cooling solution for high-performance computing applications such as AI, and found that two-phase immersion cooling reduced a server’s power consumption by 5% to 15% (every little bit helps).
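For a sense of scale, here is a back-of-the-envelope calculation of what a 5% to 15% per-server saving adds up to over a year. The rack size and wattage are invented round numbers, not figures from Microsoft:

```python
# Hypothetical rack: 40 servers at 500 W each, running around the clock.
servers, watts_per_server, hours_per_year = 40, 500, 24 * 365

# Baseline annual energy use in kilowatt-hours.
baseline_kwh = servers * watts_per_server * hours_per_year / 1000

# Annual savings at the low and high ends of the reported 5-15% range.
for saving in (0.05, 0.15):
    print(f"{saving:.0%} saving: {baseline_kwh * saving:,.0f} kWh/year")
# 5% saving: 8,760 kWh/year
# 15% saving: 26,280 kWh/year
```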

Meanwhile, companies like Submer claim they reduce energy consumption by 50%, water use by 99%, and take up 85% less space.

For cloud computing companies, the ability to keep these servers up and running even during spikes in demand, when they’d consume even more power, adds flexibility and ensures uptime even when servers are overtaxed, according to Microsoft.

“[We] know that with Teams when you get to 1 o’clock or 2 o’clock, there is a huge spike because people are joining meetings at the same time,” Marcus Fontoura, a vice president on Microsoft’s Azure team, said on the company’s internal blog. “Immersion cooling gives us more flexibility to deal with these burst-y workloads.”

At this point, data centers are a critical component of the internet infrastructure that much of the world relies on for… well… pretty much every tech-enabled service. That reliance however has come at a significant environmental cost.

“Data centers power human advancement. Their role as a core infrastructure has become more apparent than ever and emerging technologies such as AI and IoT will continue to drive computing needs. However, the environmental footprint of the industry is growing at an alarming rate,” Alexander Danielsson, an investment manager at Norrsken VC noted last year when discussing that firm’s investment in Submer.

Solutions under the sea

If submerging servers in experimental liquids is one potential solution to the problem, then sinking them in the ocean is another way companies are trying to cool data centers without expending too much power.

Microsoft has already been operating an undersea data center for the past two years. The company trotted out the tech last year as part of a push to aid in the search for a COVID-19 vaccine.

These pre-packed, shipping container-sized data centers can be spun up on demand and run deep under the ocean’s surface for sustainable, high-efficiency and powerful compute operations, the company said.

The liquid cooling project shares most similarity with Microsoft’s Project Natick, which is exploring the potential of underwater datacenters that are quick to deploy and can operate for years on the seabed sealed inside submarine-like tubes without any onsite maintenance by people. 

In those data centers nitrogen air replaces an engineered fluid and the servers are cooled with fans and a heat exchanger that pumps seawater through a sealed tube.

Startups are also staking claims to cool data centers out on the ocean (the seaweed is always greener in somebody else’s lake).

Nautilus Data Technologies, for instance, has raised over $100 million (according to Crunchbase) to develop data centers dotting the surface of Davy Jones’ locker. The company is currently developing a data center project co-located with a sustainable energy project in a tributary near Stockton, Calif.

With the two-phase immersion cooling tech, Microsoft is hoping to bring the benefits of ocean cooling onto the shore. “We brought the sea to the servers rather than put the datacenter under the sea,” Microsoft’s Alissa said in a company statement.

Ioannis Manousakis, a principal software engineer with Azure (left), and Husam Alissa, a principal hardware engineer on Microsoft’s team for datacenter advanced development (right), walk past a container at a Microsoft datacenter where computer servers in a two-phase immersion cooling tank are processing workloads. Photo by Gene Twedt for Microsoft.


By Jonathan Shieber

Blue dot raises $32M for AI that helps businesses manage their tax accounting

Artificial intelligence has become a fundamental cornerstone of how a lot of business software works, providing a useful boost in reading, understanding, and using the often-fragmented trove of data that organizations generate these days. In the latest development, an Israeli startup called Blue dot, which uses AI to help companies handle their tax accounting, is announcing $32 million in funding to continue its growth, specifically addressing the demand from companies for more user-friendly tools to help read and correctly itemize expenses for tax purposes.

“The tax sector is very complicated, and we are playing in a very large space, but it’s a huge revolution,” Blue dot’s CEO and co-founder Isaac Saft said in an interview. “Business and enterprise accounting is just not going to look the same in the future as it does today.”

The funding is being led by Ibex Investors in partnership with Lutetia Technology Partners, with past investors Lamaison Partners, Viola and Target Global also contributing. Blue dot rebranded only last week from its original name, VATBox (part of the funding will be used to help Blue dot move deeper into the U.S. market, where the concept of VAT is not quite so ubiquitous: there is no national sales tax and states determine the rates themselves).

Pitchbook notes that under its previous name, the startup last raised money in 2017, a $20 million Series B led by Viola at a $120 million post-money valuation.

While Blue dot is not disclosing valuation today, it’s likely to be significantly higher than this based on some of its engagements. In addition to customers like Amazon, tobacco giant BAT and Dell, it also has a partnership with one of the bigger names in expense accounting, SAP Concur, which uses Blue dot to power its expense data entry tool to automatically read charges and figure out how to itemize them so that employees or accountants don’t need to go through the pain of that themselves.

As Saft describes it, part of what is propelling his company’s business is the bigger trend of consumerization and the role that it has played in enterprise services: the working world has picked up a lot of technology tools, led by the smartphone, to help them organize their personal lives, and a lot of what they are being “served” through technology is increasingly personalized with lower barriers of entry, whether its on e-commerce sites, entertainment or social media. In the working world, they can often be frustrated as a result with how much work something like expenses can involve — a process that gets ever more complicated the more strict tax regimes become.

Blue dot’s approach is to essentially view the tax accounting process as something that can be improved with AI to make it easier for people to use — whether those people are workers itemizing their expenses, or accounts auditing them and running those through even bigger accounting processes. With a machine learning system that both takes into account a company’s own internal compliance and company policies, and the wider tax and regulatory framework, Blue dot helps “read” an expense and figure out how to notate it, how much tax should be accounted and where, and so on.
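Blue dot has not published its model internals, but the flow it describes (read an expense, classify it, then apply the relevant tax rule) can be sketched in miniature. Everything below is hypothetical: the keyword rules stand in for a trained classifier, and the category names and recoverable-VAT shares are invented for illustration.

```python
# Hypothetical sketch of the expense-itemization flow Blue dot describes:
# classify a raw expense line, then apply a tax rule for that category.
# Keyword rules and VAT shares below are invented, not Blue dot's actual logic.

CATEGORY_KEYWORDS = {
    "meals": ["restaurant", "lunch", "dinner", "cafe"],
    "travel": ["taxi", "flight", "hotel", "train"],
    "office": ["paper", "printer", "stationery"],
}

# Illustrative only: the fraction of VAT recoverable per expense category.
VAT_RECOVERABLE = {"meals": 0.5, "travel": 1.0, "office": 1.0}

def classify_expense(description: str) -> str:
    """Naive keyword classifier standing in for a trained ML model."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in text for word in keywords):
            return category
    return "uncategorized"

def recoverable_vat(description: str, gross: float, vat_rate: float) -> float:
    """VAT portion of a gross amount, scaled by the category's recoverable share."""
    category = classify_expense(description)
    vat_paid = gross - gross / (1 + vat_rate)
    return round(vat_paid * VAT_RECOVERABLE.get(category, 0.0), 2)

# A 120.00 hotel bill at 20% VAT contains 20.00 of VAT, fully recoverable here.
print(recoverable_vat("Hotel in Berlin, 2 nights", 120.0, 0.20))  # → 20.0
```

In a real system the classifier would be a trained model and the rates would come from each jurisdiction’s actual rules; the shape of the pipeline, not the numbers, is the point here.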

This is especially important as the process of entering and managing expenses gets pushed out to the people spending the money, rather than dedicated accountants handling that work on their behalf. An awareness of how modern offices are functioning today and evolving is one reason why investors were interested here.

“We believe Blue dot can change the way organizations worldwide manage accounting and its tax implications for their expenses,” Gal Gitter, a partner at Ibex, said in a statement. “There’s been a major market shift away from centralization of enterprise functions, including procurement. As that accelerates, more companies will be looking for ways to replace costly and complex manual processes with digital, automated solutions that use data and AI to essentially enable transactions to report themselves, which Blue dot delivers.”


By Ingrid Lunden

Moveworks expands IT chatbot platform to encompass entire organization

When investors gave Moveworks a hefty $75 million Series B at the end of 2019, they were investing in a chatbot startup that to that point had been tuned to answer IT help questions in an automated way. Today, the company announced it had used that money to expand the platform to encompass employee questions across all lines of business.

At the time of that funding, nobody could have anticipated a pandemic, but throughout last year, as companies moved to work from home, having an automated system like Moveworks in place became even more crucial, says CEO and company co-founder Bhavin Shah.

“It was a tragic year on a variety of fronts, but what it did was it coalesced a lot of energy around people’s need for support, people’s need for speed and help,” Shah said. It helps that employees typically access the Moveworks chatbot inside collaboration tools like Slack or Microsoft Teams, and people have been spending more time in these tools while working at home.

“We definitely saw a lot more interest in the market, and part of that was fueled by the large scale adoption of collaboration tools like Slack and Microsoft Teams by enterprises around the world,” he said.

The company is working with 100 large enterprise customers today, and those customers were looking for a more automated way for employees to ask questions about a variety of tooling from HR to finance and facilities management. While Shah says expanding the platform to move beyond IT into other parts of an organization had been on the roadmap, the pandemic definitely underscored the need to expand even more.

While the company spent its first several years tuning the underlying artificial intelligence technology for IT language, they had built it with expansion in mind. “We learned how to build a conversational system so that it can be dynamic and not be predicated on some person’s forethought around [what the question and answer will be] — that approach doesn’t scale. So there were a lot of things around dealing with all these enterprise resources and so forth that really prepared us to be an enterprise-wide partner,” Shah said.

The company also announced a new communications tool that enables companies to use the Moveworks bot to communicate directly with employees to get them to take some action. Shah says companies usually send out an email telling employees, for example, that they have to update their passwords. The bot instead tells you it’s time to do that and provides a link to walk you through the process. He says that beta testers have seen a 70% increase in responses using the bot to communicate about an action instead of email.

Shah recognizes that a technology that understands language is going to have a lot of cultural variances and nuances and that requires a diverse team to build a tool like this. He says that his HR team has a set of mandates to make sure they are interviewing people in under-represented roles to build a team that reflects the needs of the customer base and the world at large.

The company has been working with about a dozen customers over the last 9 months on the platform expansion, iterating with these customers to improve the quality of the responses, regardless of the type of question or which department it involves. Today, these tools are generally available.


By Ron Miller

Feedzai raises $200M at a $1B+ valuation for AI tools to fight financial fraud

On the heels of Jumio announcing a $150 million injection this week to continue building out its AI-based ID verification and anti-money laundering platform, another startup in the space is levelling up. Feedzai, which provides banks, others in the financial sector, and any company managing payments online with AI tools to spot and fight fraud — its cornerstone service involves super quick (3 millisecond) checks happening in the background while transactions are being made — has announced a Series D of $200 million. It said that the new financing is being made at a valuation of over $1 billion.

The round is being led by KKR, with Sapphire Ventures and strategic backer Citi Ventures — both past investors — also participating. Feedzai said it will be using the funds for further R&D and product development, to expand into more markets outside the U.S. — it was originally founded in Portugal but now is based out of San Mateo — and towards business development, specifically via partnerships to integrate and sell its tools.

One of those partners looks to be Citi itself:

“Citi is committed to advancing global payments anchored on transparency, efficiency, and control, and our partnership with Feedzai is allowing us to provide customers with technology that seamlessly balances agility and security,” said Manish Kohli, Global Head of Payments and Receivables, with Citi’s Treasury and Trade Solutions, in a statement.

The funding is coming at a time when the need for fraud protection for those managing transactions online has reached a high watermark, leading to a rush of customers for companies in the field.

Feedzai says that its customers include 4 of the 5 largest banks in North America and 80% of the world’s Fortune 500 companies; that it covers 154 million individual and business taxpayers in the U.S.; and that it has processed $9 billion in online transactions for 2 of the world’s most valuable athletic brands. In total its reach covers some 800 million customers of the businesses that use its services.

In addition to Citibank, its customers include Fiserv, Santander, SoFi, and Standard Chartered’s Mox.

The round comes nearly four years after Feedzai raised its Series C, a $50 million round led by an unnamed investor and with an undisclosed valuation. Sapphire also participated in that round.

While money laundering, fraud and other kinds of illicit financial activity were already problems then, in the interim, the problem has only compounded, not least because of how much activity has shifted online, accelerating especially in the last year of pandemic-driven lockdowns. That’s been exacerbated also by a general rise in cybercrime — of which financial fraud remains the biggest component and motivator.

Within that bigger trend, solutions based on artificial intelligence have really emerged as critical to the task of identifying and fighting those illicit activities. Not only is that because AI solutions are able to make calculations and take actions and simply process more than non-AI based tools, or humans for that matter, but they are then able to go head to head with much of the fraud taking place, which itself is being built out on AI-based platforms and requires more sophistication to identify and combat.

For banking customers, Feedzai’s approach has been disruptive in part because of how it has conceived of the problem: it has built solutions that can be used across different scenarios, making them more powerful since the AI system is subsequently “learning” from more data. This is in contrast to how many financial service providers had conceived and tackled the issue in the past.

“Until now banks have used solutions based on verticals,” Nuno Sebastiao, co-founder and CEO of Feedzai, said in the past to TechCrunch. “The fraud solution you have for an ATM wouldn’t be the same fraud solution you would use for online banking, which wouldn’t be the same fraud solution you would have for a voice call center.” As these companies have refreshed their systems, many have taken a more agnostic approach like the kind Feedzai has built.

The scale of the issue is clear, and unfortunately also something many of us have experienced first-hand. Feedzai says its data indicates that in the last quarter of 2020, consumers saw a 650% increase in account takeover scams, a 600% increase in impersonation scams, and a 250% increase in online banking fraud attacks versus the first quarter of 2020. (Those periods are, essentially, before-pandemic and during-pandemic comparisons.)

“The past 12 months have accelerated the world’s dependency on electronic financial services – from online banking to mobile payments, and in turn have increased fraud and money laundering activity. Our services are in more demand than ever,” said Sebastiao in a statement today.

Indeed, yesterday, when I covered Jumio’s $150 million round, I said I wouldn’t consider its funding to be an outlier (even though Jumio made clear it was the largest funding to date in its space): the fast follow from Feedzai, with an even higher amount of financing, really does underscore the trend at the moment.

In addition to these two, one of Feedzai’s biggest competitors, Kount, was acquired by credit ratings giant Equifax earlier this year for $640 million to move deeper into the space. (And related to that field, in the area of identity management, which goes hand-in-hand with tools for laundering and fraud, Okta acquired Auth0 for $6.5 billion.)

Other big rounds for startups in the wider space have included ForgeRock ($96 million round), Onfido ($100 million), Payfone ($100 million), ComplyAdvantage ($50 million), Ripjar ($36.8 million), Truework ($30 million), Zeotap ($18 million) and Persona ($17.5 million).

KKR’s involvement in this round is notable as another example of a private equity firm getting in earlier, via venture rounds, with fast-scaling startups, similar to Great Hill’s investment in Jumio yesterday and a number of other examples. The firm says it’s making this investment out of its Next Generation Technology Growth Fund II, which is focused on growth equity investment opportunities in the technology space.

“Feedzai offers a powerful solution to one of the biggest challenges we are facing today: financial crime in the digital age. Global commerce depends on future-proof technologies capable of dealing with a rapidly evolving threat landscape. At the same time, consumers rightfully demand a great customer experience, in addition to strong security layers when using banking or payments services,” said Stephen Shanley, Managing Director at KKR, in a statement.

“We believe Feedzai’s platform uniquely meets these expectations and more, and we are looking forward to working with Nuno and the rest of the team to expand their offering even further,” added Spencer Chavez, Principal at KKR.


By Ingrid Lunden

Dataminr raises $475M on a $4.1B valuation for real-time insights based on 100k sources of public data

Significant funding news today for one of the startups making a business out of tapping huge, noisy troves of publicly available data across social media, news sites, undisclosed filings and more. Dataminr, which ingests information from a mix of 100,000 public data sources, and then based on that provides customers real-time insights into ongoing events and new developments, has closed on $475 million in new funding. Dataminr has confirmed that this Series F values the company at $4.1 billion as it gears up for an IPO in 2023.

This Series F is coming from a mix of investors including Eldridge (a firm that owns the LA Dodgers but also makes a bunch of other sports, media, tech and other investments), Valor Equity Partners (the firm behind Tesla and many tech startups), MSD Capital (Michael Dell’s fund), Reinvent Capital (Mark Pincus and Reid Hoffman’s firm), ArrowMark Partners, IVP, Eden Global and investment funds managed by Morgan Stanley Tactical Value, among others.

To put its valuation into some context, the New York-based company last raised money in 2018 at a $1.6 billion valuation. And with this latest round, it has now raised over $1 billion in outside funding, based on PitchBook data. This latest round has been in the works for a while and was rumored last week at a lower valuation than what Dataminr ultimately got.

The funding is coming at a critical moment, both for the company and for the world at large.

In terms of the company, Dataminr has been seeing a huge surge of business.

Ted Bailey, the founder and CEO, said in an interview that it will be using the money to continue growing its business in existing areas: adding more corporate customers, expanding international sales and expanding its AI platform as it gears up for an IPO, most likely in 2023. In addition to being used by journalists and newsrooms, NGOs and other public organizations, its corporate business today, Bailey said, includes half of the Fortune 50 and a number of large public sector organizations. Over the last year, revenue from that large enterprise segment of its customers doubled.

“Whether it’s for physical safety, reputation risk or crisis management, or business intelligence or cybersecurity, we’re providing critical insights on a daily basis,” he said. “All of the events of the recent year have created a sense of urgency, and demand has really surged.”

Activity on the many platforms that Dataminr taps to ingest information has been on the rise for years, but it has grown exponentially in the last year especially, as more people spend more time at home and online, away from physically interacting with each other. That means more data for Dataminr to crawl, but also, quite possibly, more at stake for all of us: there is so much more out there than before, and so much more to be gleaned from that information.

That also means that the wider context of Dataminr’s growth is not quite so clear cut.

The company’s data tools have indeed usefully helped first responders react in crisis situations, feeding them data faster than even their own channels might do; and it provides a number of useful, market-impacting insights to businesses.

But Dataminr’s role in helping its customers — which include policing forces — connect the dots on certain issues has not always been seen as a positive. One controversial accusation made last year was that Dataminr data was being used by police for racial profiling. In years past, it has been barred by specific partners like Twitter from sharing data with intelligence agencies. Twitter used to be a 5% shareholder in the company. Bailey confirmed to me that it no longer is but remains a key partner for data. I’ve contacted Twitter to see if I can get more detail on this and will update the story if and when I learn more. Twitter made $509 million in revenues from services like data licensing in 2020, up by about $45 million on the year before.

In defense of Dataminr, Bailey said that the negative spins on what it does result from “misperceptions,” since it can’t track people or do anything proactive. “We deliver alerts on events and it’s [about] a time advantage,” he said, likening it to the Associated Press, but “just earlier.”

“The product can’t be used for surveillance,” Bailey added. “It is prohibited.”

Of course, in the ongoing debate about surveillance, it’s more about how Dataminr’s customers might ultimately use the data that they get through Dataminr’s tools, so the criticism is more about what it might enable rather than what it does directly.

Despite some of those persistent questions about the ethics of AI and other tools and how they are implemented by end users, backers are bullish on the opportunities for Dataminr to continue growing.

Eden Global Partners served as strategic partner for the Series F capital round.




By Ingrid Lunden

TigerGraph raises $105M Series C for its enterprise graph database

TigerGraph, a well-funded enterprise startup that provides a graph database and analytics platform, today announced that it has raised a $105 million Series C funding round. The round was led by Tiger Global and brings the company’s total funding to over $170 million.

“TigerGraph is leading the paradigm shift in connecting and analyzing data via scalable and native graph technology with pre-connected entities versus the traditional way of joining large tables with rows and columns,” said TigerGraph founder and CEO Yu Xu. “This funding will allow us to expand our offering and bring it to many more markets, enabling more customers to realize the benefits of graph analytics and AI.”

Current TigerGraph customers include the likes of Amgen, Citrix, Intuit, Jaguar Land Rover and UnitedHealth Group. Using a SQL-like query language (GSQL), these customers can use the company’s services to store and quickly query their graph databases. At the core of its offerings is the TigerGraphDB database and analytics platform, but the company also offers a hosted service, TigerGraph Cloud, with pay-as-you-go pricing, hosted either on AWS or Azure. With GraphStudio, the company also offers a graphical UI for creating data models and visually analyzing them.
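The “pre-connected entities versus joining tables” contrast that Xu draws can be illustrated with a toy example. This is plain Python, not TigerGraph’s engine or its GSQL language: in a graph store, a question like “which accounts sit within two transaction hops of account A” becomes a neighbor walk over stored edges rather than repeated self-joins over a large relational table.

```python
# Toy illustration (not TigerGraph's engine or GSQL) of graph traversal:
# entities are stored pre-connected, so a multi-hop question is a
# breadth-first neighbor walk instead of a chain of relational self-joins.

from collections import deque

# Adjacency list: each account maps to the accounts it has transacted with.
EDGES = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": [],
    "E": ["F"],
    "F": [],
}

def within_hops(start: str, max_hops: int) -> set:
    """Breadth-first walk returning every node reachable within max_hops."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # don't expand past the hop budget
        for neighbor in EDGES.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen - {start}

print(sorted(within_hops("A", 2)))  # → ['B', 'C', 'D', 'E']
```

In SQL, the same two-hop question would typically require joining the transactions table to itself once per hop, which is what the graph model is meant to avoid at scale.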

The promise for the company’s database services is that they can scale to tens of terabytes of data with billions of edges. Its customers use the technology for a wide variety of use cases, including fraud detection, customer 360, IoT, AI, and machine learning.

Like so many other companies in this space, TigerGraph is benefiting from a tailwind, thanks to the fact that many enterprises have accelerated their digital transformation projects during the pandemic.

“Over the last 12 months with the COVID-19 pandemic, companies have embraced digital transformation at a faster pace driving an urgent need to find new insights about their customers, products, services, and suppliers,” the company explains in today’s announcement. “Graph technology connects these domains from the relational databases, offering the opportunity to shrink development cycles for data preparation, improve data quality, identify new insights such as similarity patterns to deliver the next best action recommendation.”


By Frederic Lardinois

Peak AI nabs $21M for a platform to help non-tech companies make AI-based decisions

One of the biggest challenges for organizations in modern times is deciding where, when, and how to use the advances of technology, when the organizations are not technology companies themselves. Today, a startup out of Manchester, England, is announcing some funding for a platform that it believes can help.

Peak AI, which has built technology that it says can help enterprises — specifically those that work with physical products such as retailers, consumer goods companies, and manufacturing organizations — make better, AI-based evaluations and decisions, has closed a round of $21 million.

The Series B is being led by Oxx, with participation from past investors MMC Ventures and Praetura Ventures, as well as new backer Arete. It has raised $43 million to date and is not disclosing its valuation.

Richard Potter, the CEO who co-founded the company with Atul Sharma and David Leitch, said that the funding will be used to continue expanding the functionality of its platform, adding offices in the U.S. and India, and growing its customer base.

Its list of clients today is an impressive one, including the retailer PrettyLittleThing, KFC, PepsiCo, Marshalls and Speedy Hire.

As Potter describes it, Peak identified its opportunity early on. It was founded in 2014, a time when non-tech enterprises were just starting to grasp how the concept of AI could apply to their businesses but felt it was out of their reach.

Indeed, the larger landscape for AI services at that time was largely one focused on technology companies, specifically companies like Google, Amazon and Apple that were building AI products to power their own services, and often snapping up the most interesting talent in the field as it manifested through smaller startups and universities.

Peak’s basic premise was to build AI not as a business goal for itself but as a business service. Its platform sits within an organization and ingests any data source that a company might wish to feed into it.

While initial integration needs technical know-how — either at the company itself or via a systems integrator — using Peak day-to-day can be done by both technical and non-technical workers.

Peak says it can help answer a variety of questions that those people might have, such as how much of an item to produce, and where to ship it, based on a complex mix of sales data; how to manage stock better; or when to ramp up or ramp down headcount in a warehouse. The platform can also be used to help companies with marketing and advertising, figuring out how to better target campaigns to the right audiences, and so on.
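Peak does not disclose its models, but one of the questions cited, when to restock, has a classical textbook answer called the reorder point: expected demand over the supplier’s lead time plus a safety buffer. A minimal sketch with invented numbers (this is the textbook formula, not Peak’s actual method):

```python
# Classical reorder-point calculation -- a textbook stand-in, not Peak's
# actual model -- for the "how to manage stock better" question.
# reorder_point = expected demand during supplier lead time + safety stock.

def reorder_point(daily_demand: float, lead_time_days: float,
                  safety_stock: float) -> float:
    return daily_demand * lead_time_days + safety_stock

def should_reorder(on_hand: float, daily_demand: float,
                   lead_time_days: float, safety_stock: float) -> bool:
    """True once stock on hand falls to the reorder point or below."""
    return on_hand <= reorder_point(daily_demand, lead_time_days, safety_stock)

# Selling ~40 units/day with a 5-day lead time and 60 units of buffer:
# reorder once stock falls to 260 units or below.
print(reorder_point(40, 5, 60))        # → 260
print(should_reorder(300, 40, 5, 60))  # → False
```

A platform like the one described would presumably estimate the demand and lead-time inputs from the ingested sales and supplier data rather than take them as constants.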

Peak is not the first company that has seized on the concept of using a “general” AI to give non-tech organizations the same kinds of superpowers that the likes of big tech now use in their own businesses everyday.

Sometimes the ambition has outstripped the returns, however.

Witness Element AI, a highly-touted startup backed by a long list of top-shelf strategic and financial investors to build, essentially, an AI services business for non-tech companies to use as they might these days use Accenture. It never quite got there, though, and was acquired by ServiceNow last year at a devalued price of $500 million, the customer deals it had were wound down, and the tech was integrated into the bigger company’s stack.

Other efforts within hugely successful tech companies have not fared that well either.

“Einstein’s features are essentially useless, and you can quote me on that,” said Potter of Salesforce’s in-house CRM AI business. “Because it is too generic, it doesn’t predict anything useful.”

And that is perhaps the crux of why Peak AI is working for now: it has remained focused for now on a limited number of segments of the market, in particular those with physical objects as the end product, giving the AI that it has built a more targeted end point. In other words, it’s “general” but only for specific industries.

And it claims that this is paying off. Peak’s customers have reported a 5% increase in total company revenues, a doubling of return on advertising spend, a 12% reduction in inventory holdings, and a 5% reduction in supply chain costs, according to the company (although it doesn’t specify which companies, which products, or anything that points to who or what is being described).

“Richard and the excellent Peak team have a compelling vision to optimize entire businesses through Decision Intelligence and they’re delivering real-world benefits to a raft of household name customers already,” said Richard Anton, a general partner at Oxx, in a statement. “The pandemic has meant digitization is no longer a choice; it’s a requirement. Peak has made it easier for businesses to get started and see rapid results from AI-enabled decision making. We are delighted to support Peak on their way to becoming the category-defining global leader in Decision Intelligence.” Anton is joining the board with this round.


By Ingrid Lunden

Base Operations raises $2.2 million to modernize physical enterprise security

Typically when we talk about tech and security, the mind naturally jumps to cybersecurity. But equally important, especially for global companies with large, multinational organizations, is physical security – a key function at most medium-to-large enterprises, and yet one that to date, hasn’t really done much to take advantage of recent advances in technology. Enter Base Operations, a startup founded by risk management professional Cory Siskind in 2018. Base Operations just closed their $2.2 million seed funding round, and will use the money to capitalize on its recent launch of a street-level threat mapping platform for use in supporting enterprise security operations.

The funding, led by Good Growth Capital and including investors like Magma Partners, First In Capital, Gaingels and First Round Capital founder Howard Morgan, will be used primarily for hiring, as Base Operations looks to continue its team growth after doubling its employee base this past month. It’ll also be put to use extending and improving the company’s product, and growing the startup’s global footprint. I talked to Siskind about her company’s plans on the heels of this round, as well as the wider opportunity and how her company is serving the market in a novel way.

“What we do at Base Operations is help companies keep their people and operations secure with ‘Micro Intelligence,’ which is street-level threat assessments that facilitate a variety of routine security tasks in the travel security, real estate and supply chain security buckets,” Siskind explained. “Anything that the Chief Security Officer would be in charge of, but not cyber – so anything that intersects with the physical world.”

Siskind has first-hand experience about the complexity and challenges that enter into enterprise security, since she began her career working for global strategic risk consultancy firm Control Risks in Mexico City. Because of her time in the industry, she’s keenly aware of just how far physical and political security operations lag behind their cybersecurity counterparts. It’s an often-overlooked aspect of corporate risk management, particularly since in the past it’s been something that most employees at North American companies only ever encounter periodically, when their roles involve frequent travel. The events of the past couple of years have changed that, however.

“This was the last bastion of a company that hadn’t been optimized by a SaaS platform, basically, so there was some resistance and some allegiance to legacy players,” Siskind told me. “However, the events of 2020 sort of turned everything on its head, and companies realized that the security department, and what happens in the physical world, is not just about compliance – it’s actually a strategic advantage to invest in those sort of services, because it helps you maintain business continuity.”

The COVID-19 pandemic, increased frequency and severity of natural disasters, and global political unrest all had significant impact on businesses worldwide in 2020, and Siskind says that this has proven a watershed moment in how enterprises consider physical security in their overall risk profile and strategic planning cycles.

“[Companies] have just realized that if you don’t invest in how to keep your operations running smoothly in the face of rising catastrophic events, you’re never going to achieve the profits that you need, because it’s too choppy, and you have all sorts of problems,” she said.

Base Operations addresses this problem by taking available data from a range of sources and pulling it together to inform threat profiles. Their technology is all about making sense of the myriad stream of information we encounter daily – taking the wash of news that we sometimes associate with ‘doom-scrolling’ on social media, for instance, and combining it with other sources using machine learning to extrapolate actionable insights.

Those sources of information include “government statistics, social media, local news, data from partnerships, like NGOs and universities,” Siskind said. That data set powers their Micro Intelligence platform, and while the startup’s focus today is on helping enterprises keep people safe, while maintaining their operations, you can easily see how the same information could power everything from planning future geographical expansion, to tailoring product development to address specific markets.

Siskind saw there was a need for this kind of approach to an aspect of business that’s essential, but that has been relatively slow to adopt new technologies. From her vantage point two years ago, however, she couldn’t have anticipated just how urgent the need for better, more scalable enterprise security solutions would become, and Base Operations now seems perfectly positioned to help meet that need.


By Darrell Etherington

Intenseye raises $4M to boost workplace safety through computer vision

Workplace injuries and illnesses cost the U.S. upwards of $250 billion each year, according to the Economic Policy Institute. ERA-backed startup Intenseye, a machine learning platform, has raised a $4 million seed round to try to bring that number way down in an economical and efficient way.

The round was co-led by Point Nine and Air Street Capital, with participation by angel investors from Twitter, Cortex, Fastly, and Even Financial.

Intenseye integrates with existing network-connected cameras within facilities and then uses computer vision to monitor employee health and safety on the job. This means that Intenseye can identify health and safety violations, from not wearing a hard hat to ignoring social distancing protocols and everything in between, in real time.

The service’s dashboard incorporates federal and local workplace safety laws, as well as an individual organization’s rules to monitor worker safety in real time. All told, the Intenseye platform can identify 30 different unsafe behaviors which are common within workplaces. Managers can further customize these rules using a drag-and-drop interface.

When a violation is spotted, employee health and safety professionals immediately receive an alert, by text or email, so they can resolve the issue.

Intenseye also takes the aggregate of workplace safety compliance within a facility to generate a compliance score and diagnose problem areas.
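Rolling per-rule violation counts up into a single score is, conceptually, a weighted aggregation. The sketch below is purely illustrative: the rule names, weights, and formula are hypothetical assumptions, not Intenseye's actual scoring method.

```python
# Hypothetical compliance-score aggregation (NOT Intenseye's formula):
# take per-rule violation rates, weight them by severity, and map the
# weighted average penalty onto a 0-100 score.
def compliance_score(violations, observations, weights):
    """violations/observations: dicts of counts keyed by rule name;
    weights: dict of illustrative severity weights keyed by rule name."""
    penalty = 0.0
    total_weight = sum(weights.values())
    for rule, weight in weights.items():
        seen = observations.get(rule, 0)
        if seen:
            rate = violations.get(rule, 0) / seen  # fraction of checks violated
            penalty += weight * rate
    # A penalty-free facility scores 100; heavier weighted violations pull it down.
    return round(100 * (1 - penalty / total_weight), 1)
```

A facility with 5 hard-hat violations in 100 checks and a clean distancing record would score in the mid-90s under these made-up weights; the point is only that such a score can localize problem areas by rule.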

The company charges a base deployment fee and then an annual fee based on the number of cameras the facility wants to use as Intenseye monitoring points.

Co-founder Sercan Esen says that one of the greatest challenges of the business is a technical one: Intenseye monitors workplace safety through computer vision to send EHS (employee health and safety) violation alerts, but it never analyzes faces or identifies individuals, and all video is destroyed on the fly, never stored by Intenseye.

The Intenseye team is made up of 20 people.

“Today, our team at Intenseye is 20% female and 80% male and includes 4 nationalities,” said Esen. “We have teammates with MSs in computer science and teammates who have graduated from high school.”

Diversity and inclusion among the team is critical at every company, but is particularly important at a company that builds computer vision software.

The company has moved to remote work in the wake of the pandemic and is using VR to build a virtual office and connect workers in a way that’s more immersive than Zoom.

Intenseye is currently deployed across 30 cities and will use the funding to build out the team, particularly in the sales and marketing departments, and deploy go-to-market strategies.


By Jordan Crook

SentinelOne to acquire high-speed logging startup Scalyr for $155M

SentinelOne, a late-stage security startup that helps customers make sense of security data using AI and machine learning, announced today that it is acquiring Scalyr, the high-speed logging startup, for $155 million in stock and cash.

SentinelOne sorts through oodles of data to help customers understand their security posture, and having a tool that enables engineers to iterate rapidly on the data and get to the root of a problem is going to be extremely valuable for them, CEO and co-founder Tomer Weingarten explained. “We thought Scalyr would be just an amazing fit to our continued vision in how we secure data at scale for every enterprise [customer] out there,” he told me.

He said they spent a lot of time shopping for a company that could meet their unique scaling needs, and when they came across Scalyr, which has built a real-time data lake, they saw the potential pretty quickly. “When we look at the scale of our technology, we obviously scoured the world to find the best data analytics technology out there. We [believe] we found something incredibly special when we found a platform that can ingest data, and make it accessible in real time,” Weingarten explained.

He believes the real-time element is a game changer because it enables customers to prevent breaches rather than just react to them. “If you’re thinking about mitigating attacks or reacting to attacks, if you can do that in real time and you can process data in real time, and find the anomalies in real time and then meet them, you’re turning into a system that can actually deflect the attacks and not just see them and react to them,” he explained.

The company sees Scalyr as a product it can integrate into the platform, but also one that will remain a stand-alone product. That means existing customers should be able to continue using Scalyr as before, while benefiting from having a larger company contributing to its R&D.

While SentinelOne is not a public company, it is a pretty substantial private one, having raised over $695 million, according to Crunchbase data. The company’s most recent funding round came in February last year, a $200 million investment with a $1.1 billion valuation.

As for Scalyr, it was launched in 2011 by Steve Newman, who first built a word processor called Writely and sold it to Google in 2006. It was actually the basis for what became Google Docs. Newman stuck around and started building the infrastructure to scale Google Docs, and he used that experience and knowledge to build Scalyr. The startup raised $27 million along the way, according to Crunchbase data, including a $20 million Series A investment in 2017.

The deal will close this quarter, and when it does, Scalyr’s 45 employees will be joining SentinelOne.


By Ron Miller

Pinecone lands $10M seed for purpose-built machine learning database

Pinecone, a new startup from the folks who helped launch Amazon SageMaker, has built a vector database that stores data in a specialized format to help build machine learning applications faster, something that was previously only accessible to the largest organizations. Today the company came out of stealth with a new product and announced a $10 million seed investment led by Wing Venture Capital.

Company co-founder Edo Liberty says that he started the company because of this fundamental belief that the industry was being held back by the lack of wider access to this type of database. “The data that a machine learning model expects isn’t a JSON record, it’s a high dimensional vector that is either a list of features or what’s called an embedding that’s a numerical representation of the items or the objects in the world. This [format] is much more semantically rich and actionable for machine learning,” he explained.
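The distinction Liberty draws between a JSON record and an embedding can be shown with a toy Python snippet. All values here are made up for illustration, and nothing below is Pinecone's API.

```python
import math

# The same item as a JSON-style record: human-readable fields,
# but nothing a model can measure similarity over directly.
record = {"id": "sku-123", "title": "trail running shoe", "price": 89.99}

# As an embedding: a fixed-length list of floats whose geometry
# encodes semantic similarity (a made-up 8-dimensional example).
embedding = [0.12, -0.48, 0.33, 0.91, -0.07, 0.25, -0.66, 0.18]

def cosine(a, b):
    """Similarity between two embeddings reduces to simple vector math."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Items with nearby embeddings are semantically related, which is what makes the format "actionable for machine learning" in a way a raw record is not.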

He says that this is a concept that is widely understood by data scientists, and supported by research, but up until now only the biggest and most technically advanced companies like Google or Pinterest could take advantage of this approach. Liberty and his team created Pinecone to put that kind of technology in reach of any company.

The startup spent the last couple of years building the solution, which consists of three main components. The main piece is a vector engine to convert the data into this machine-learning ingestible format. Liberty says that this is the piece of technology that contains all the data structures and algorithms that allow them to index very large amounts of high dimensional vector data, and search through it in an efficient and accurate way.
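Conceptually, the vector engine's job is indexing and searching over embeddings. The brute-force sketch below shows that idea in plain Python; production systems use approximate indexing structures (such as HNSW graphs) to stay fast and accurate over very large collections, and nothing here reflects Pinecone's actual implementation.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

class BruteForceIndex:
    """Exhaustive nearest-neighbor search over stored vectors.
    Illustrates what a vector engine does; real engines replace the
    linear scan with approximate index structures to scale."""

    def __init__(self):
        self.items = {}  # item id -> embedding vector

    def upsert(self, item_id, vector):
        self.items[item_id] = vector

    def query(self, vector, top_k=3):
        # Score every stored vector against the query and return the
        # ids of the top_k most similar items.
        scored = [(cosine(vector, v), item_id) for item_id, v in self.items.items()]
        scored.sort(reverse=True)
        return [item_id for _, item_id in scored[:top_k]]
```

The trade-off the exhaustive version makes explicit is why the data structures Liberty mentions matter: a linear scan is exact but grows with collection size, while approximate indexes keep queries fast at the cost of occasionally imperfect neighbors.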

The second is a cloud hosted system to apply all of that converted data to the machine learning model, while handling things like index lookups along with the pre- and post-processing — everything a data science team needs to run a machine learning project at scale with very large workloads and throughputs. Finally, there is a management layer to track all of this and manage data transfer between source locations.

One classic example Liberty uses is an e-commerce recommendation engine. While this has been a standard part of online selling for years, he believes using a vectorized data approach will result in much more accurate recommendations, and he says the data science research bears him out.

“It used to be that deploying [something like a recommendation engine] was actually incredibly complex, and […] if you have access to a production grade database, 90% of the difficulty and heavy lifting in creating those solutions goes away, and that’s why we’re building this. We believe it’s the new standard,” he said.

The company currently has 10 people including the founders, but the plan is to double or even triple that number, depending on how the year goes. As he builds his company as an immigrant founder — Liberty is from Israel — he says that diversity is top of mind. He adds that it’s something he worked hard on at his previous positions at Yahoo and Amazon as he was building his teams at those two organizations. One way he is doing that is in the recruitment process. “We have instructed our recruiters to be proactive [in finding more diverse applicants], making sure they don’t miss out on great candidates, and that they bring us a diverse set of candidates,” he said.

Looking ahead to post-pandemic, Liberty says he is a bit more traditional in terms of office versus home, and that he hopes to have more in-person interactions. “Maybe I’m old fashioned but I like offices and I like people and I like to see who I work with and hang out with them and laugh and enjoy each other’s company, and so I’m not jumping on the bandwagon of ‘let’s all be remote and work from home’.”


By Ron Miller