What's 🔥 in Enterprise IT/VC #335
ChatGPT in the enterprise, peeling the 🧅 back - Microsoft Security Copilot 🤯
Like all of you, I’ve been having a ton of fun using ChatGPT to write and define job specs, summarize articles, and generate prompts for new articles. That said, these are table stakes, as every application touching an end user is adding this functionality. Why? Because it simply improves productivity, assuming users understand that some definitive-sounding answers may not be correct 😃.
At the moment, half of the boldstart portfolio is working on layering LLM capabilities into existing solutions, and every week at least a third of the pitches we see include some reference to LLMs. The market is moving so fast that IMO first mover advantage for brand new startups may be less an advantage than a disadvantage. Each week more breakthroughs are unleashed, and until the dust settles it’s hard to tell who’s doing what and who will own what. I know for a fact that last week’s OpenAI announcement with the ChatGPT App Store killed many a startup idea, and I expect more of these announcements to come.
However, what really excites me is digging a layer deeper and thinking through the second order effects on enterprise technology beyond code snippets for developers: security implications like vulnerability scanning, data privacy and security, and automated workflows. This week Microsoft released a preview of Microsoft Security Copilot 🤯, and it’s quite insane. If you haven’t seen this, check out the video below along with an expanded 3 minute one to show the capabilities.
It lets a security analyst query in natural language, but it can also pinpoint new threats by separating the signal from the noise, create automated workflows and runbooks to cut incident response time, and reverse engineer attacks and dig into malicious code to give analysts a deep understanding of how to prevent future attacks. This is game changing for the security industry and for Microsoft, which also states that "over time it will expand to a growing ecosystem of third-party products."
As of now, Microsoft states that Security Copilot is trained on (The New Stack):
…data from the Cybersecurity and Infrastructure Security Agency (CISA), the NIST vulnerability database, and, of course, Microsoft’s own threat intelligence database.
Microsoft Security Copilot aims to provide end-to-end defense at machine speed and scale. It integrates an LLM with a security-specific model from Microsoft, which incorporates a growing set of security skills and is informed by Microsoft’s global threat intelligence and more than 65 trillion daily signals. Running on Azure’s hyperscale infrastructure
I’m sure CrowdStrike, Palo Alto and others are not waiting to get integrated as third-party data endpoints into Microsoft Security Copilot and are instead working on their own versions, trained on their own data, to own the ❤️ and minds of security analysts.
Other areas to think about include next-gen DLP (data loss prevention) technology that uses LLMs to auto-categorize and label documents, data, and text, perhaps to block LLMs from accessing them or to create rules around which data is allowed to be used to train models. What Databricks recently announced is also fascinating (see below): it open sourced an LLM that can be trained on a small dataset of about 50,000 Q&A examples in under 3 hours on a single machine (SiliconAngle).
Databricks said it was able to take the EleutherAI model and make it “highly approachable” simply by training it with a small, 50,000-word dataset, in less than three hours using a single machine. Despite the much smaller model — only 6 billion parameters versus ChatGPT’s 175 billion — as well as a smaller dataset and training time, Databricks said, Dolly still exhibits the same “magical human interaction ability” demonstrated by ChatGPT.
“This shows that the magic of instruction following does not lie in training models on gigantic datasets using massive hardware,” Databricks explained. “Rather, the magic lies in showing these powerful open-source models specific examples of how to talk to humans, something anybody can do for a hundred dollars using this small 50K dataset of Q&A examples.”
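To make "specific examples of how to talk to humans" a bit more concrete, here is a minimal sketch of how one Q&A pair might be turned into a single training string. The template wording, the field names `instruction` and `response`, and the sample records are all my own assumptions for illustration; this is not Databricks' actual training code.

```python
# Hypothetical prompt template (an assumption, not taken from Dolly's code).
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)


def format_example(record: dict) -> str:
    """Turn one instruction/response pair into a single training string."""
    return PROMPT_TEMPLATE.format(
        instruction=record["instruction"].strip(),
        response=record["response"].strip(),
    )


# Made-up sample records, for illustration only.
examples = [
    {
        "instruction": "Summarize this alert in one sentence.",
        "response": "A login from a new device was blocked.",
    },
    {
        "instruction": "Write a subject line for a job posting.",
        "response": "Hiring: Senior Security Engineer",
    },
]

# Each string here would be one row in an instruction-tuning dataset.
training_texts = [format_example(record) for record in examples]
```

A dataset in the 50K range is just tens of thousands of strings like these, which is part of why the cost quoted above is so low.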
Also check out portfolio co Cape Privacy as it allows non-security focused developers to confidentially process sensitive data with vision, voice and language models.
Finally, how about protecting the security of the models themselves?
The future is super exciting and we’re just scratching the surface of what’s possible with new LLMs. Please reach out if you are building new capabilities for this new world!
As always, 🙏🏼 for reading and please share with your friends and colleagues.
Scaling Startups
❤️ this Don Valentine (Sequoia founder) story - that was what it was like 15+ years ago: if you didn’t go to HBS or Stanford, you were out of luck when raising a new fund
Don Valentine describes his meeting with Salomon Brothers while raising the $5M Sequoia I fund. It didn't go well. buff.ly/3B6ndH8 -- Just one fav excerpt from this 75-page interview with him.
Thanks to Paul for finding this oldie… my post from 2005 was related to the Web 2.0 Bubble
Pretty impressive run for a cybersecurity newbie - power of perspective and outside thinking - I heard in Nikesh’s first 60 days all he did was meet the top 100 customers to learn
Enterprise Tech
YC database - here’s a quick way to know what VCs think is 🔥 at the moment - I did a quick search on the Winter ‘23 class and 10% of the batch companies mention LLMs (27 out of 272), 15% are developer tools (40 out of 272), and 13% are open source related (35 out of 272)
More enterprise use cases - creating workflows and continuous loops that take the output of one query and use it as the input to the next step
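For anyone who wants to picture the loop, here is a minimal sketch of chaining, where each step's prompt is filled with the previous step's output. The `run_chain` helper, the step templates, and `fake_model` (a stand-in that just upper-cases its prompt) are all hypothetical; a real workflow would swap in an actual LLM call.

```python
from typing import Callable, List


def run_chain(model: Callable[[str], str], steps: List[str], initial_input: str) -> List[str]:
    """Feed each step's output into the next step's prompt template."""
    outputs = []
    current = initial_input
    for template in steps:
        prompt = template.format(input=current)  # previous output becomes input
        current = model(prompt)
        outputs.append(current)
    return outputs


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (an assumption): upper-cases the prompt."""
    return prompt.upper()


# Hypothetical two-step workflow: summarize, then draft a ticket from the summary.
steps = ["summarize: {input}", "draft ticket from: {input}"]
results = run_chain(fake_model, steps, "disk usage at 95% on db-1")
```

The point of the sketch is only the plumbing: the second step never sees the raw alert, just what the first step produced.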
What’s needed for GPT in the enterprise - Aaron Kalb, co-founder of Alation, data catalog co (New Stack)
In this context, he said, taking something like ChatGPT from the public internet and bringing it into the enterprise is very risky. He thinks that data needs to be, well, more intelligent before it is used by AI systems within an enterprise.
Also, he doesn’t think that the “internet scale” of ChatGPT and similar systems is needed in the enterprise. This is where Alation’s “data catalog” comes into play, as it will “distill down” the data and give it “specific mapping.”
Every organization has its own terminology, he said — that could be industry terms, or things that are very specific to that company.
“So that’s where data intelligence and the data catalog helps,” Kalb explained. “It helps to map that last mile of how language is used by people in the organization, and how data is stored in the databases.”
Alation’s software automates the process of putting an organization’s data into these “data catalogs,” which can then optionally be fed into a generative AI system (if the company wants to do that).
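One way to picture that "last mile" mapping is a small glossary that ties company terms to where the data lives and gets prepended to a question before it goes to a model. This is a sketch under my own assumptions, not Alation's product or API: the `CATALOG` entries, the column names, and `build_context` are all hypothetical.

```python
# Hypothetical catalog entries mapping company terms to database columns.
CATALOG = {
    "arr": "finance.subscriptions.annual_recurring_revenue",
    "logo churn": "sales.accounts.churned_flag",
}


def build_context(question: str, catalog: dict) -> str:
    """Prefix a question with the catalog entries whose terms it mentions.

    Matching is a naive case-insensitive substring check, so short terms
    can match inside longer words; a real system would need something better.
    """
    lowered = question.lower()
    hits = [
        f"- '{term}' refers to column {column}"
        for term, column in catalog.items()
        if term in lowered
    ]
    if not hits:
        return question
    return "Definitions:\n" + "\n".join(hits) + "\n\nQuestion: " + question


prompt = build_context("What was ARR last quarter?", CATALOG)
```

Only the definitions relevant to the question are included, which is the "distill down" idea in miniature.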
Co-founder of HashiCorp diving deep into LLMs - read the 🧵 for a list of prompts and other great suggestions
What Goldman Sachs is doing with LLMs (Insider)
But with the potential has come some uncertainties around intellectual property, regulation, and privacy. Despite Goldman — along with Citibank and JPMorgan — blocking employees from accessing ChatGPT, the bank is still working with the tech.
Argenti noted the banks' blockage of ChatGPT is no different than the standard process of companies blocking completely unrestricted access to the internet on work devices. "There is safety and there is usefulness, and that intersection is where we need to navigate," Argenti said.
Argenti and Tsementzis outlined three ways Goldman is experimenting with large language models…
Summarizing and extracting data from documents
Goldman's document-management process stands to improve from the use of generative AI, Argenti said. Banks deal with countless legal documents, related to things like loans, mortgages, and derivatives. These unstructured documents, often written by lawyers, are extremely complex and aren't made ready to be put into a machine.
How do you measure developer relations success? From
Tracking the Fake GitHub Star Black Market with Dagster, dbt and BigQuery (Dagster) - and as mentioned last week, please don’t point to your GitHub star count as the reason you’re going to build a big business
Markets
Fascinating new approach for funding Sales and Marketing as companies scale