tgroenwals shared this post · Apr 22
Clare Kitching

Data isn't the hard part.
Understanding each other is.

Ontology. Lineage. Semantic layers. Vector databases.

I've been in data for over 15 years, and sometimes even I feel like I'm decoding a foreign language.
We've turned simple ideas into jargon that makes non-data people tune out.

Here's what these terms actually mean and why they matter for AI:

▶️ Ontology
A shared definition of your core business concepts and how they relate.
It gives AI clear concepts to reason about instead of guessing.

▶️ Entity
A real-world thing like a customer, product or event.
It helps AI tell the difference between people, products and moments in time.

▶️ Metadata
Data that explains other data.
It tells AI what something means, how fresh it is and whether it can be trusted.

▶️ Physical layer
Where data is stored and processed.
It shapes how fast, scalable and reliable AI workloads can be.

▶️ Logical layer
How data is organised conceptually, not physically.
It shields AI from raw technical mess.

▶️ Semantic layer
A business-friendly layer with agreed definitions and metrics.
It stops humans and AI arguing over what a number actually means.

▶️ Schema
The formal structure of what data exists and what type it is.
It gives consistency so AI knows what to expect.
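
To make this concrete, here is a minimal sketch of a schema check in Python. The field names and the `schema_errors` helper are hypothetical, chosen only for illustration; real systems usually lean on a database or a validation library rather than hand-rolled checks.

```python
# Hypothetical schema: field name -> expected Python type
expected_schema = {"customer_id": int, "email": str, "signup_date": str}

def schema_errors(record, schema):
    """Return a list of human-readable problems; empty means the record fits."""
    errors = []
    for field, expected_type in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            errors.append(f"{field} should be {expected_type.__name__}")
    return errors

good = {"customer_id": 42, "email": "a@example.com", "signup_date": "2024-01-05"}
bad = {"customer_id": "42", "email": "a@example.com"}

good_errors = schema_errors(good, expected_schema)
bad_errors = schema_errors(bad, expected_schema)
```

The second record fails twice: the ID arrives as text, and the sign-up date is missing altogether.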

▶️ Data modelling
How entities and their relationships are designed.
It reduces confusion in how AI interprets data.

▶️ Data virtualisation
Accessing data from many sources without copying it all.
It lets AI work across systems seamlessly.

▶️ Vector database
A database that searches by similarity, not exact matches.
It enables richer retrieval and context for AI.
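
As a rough illustration of "similarity, not exact matches", the sketch below ranks a few documents by cosine similarity to a query. The three-number vectors and document names are made up for the example; real embeddings have hundreds of dimensions, and a real vector database uses indexes rather than comparing against every item.

```python
import math

def cosine_similarity(a, b):
    """Similarity of two vectors by the angle between them (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for illustration
documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "returns process": [0.8, 0.2, 0.1],
}

def most_similar(query_vector, store, top_k=2):
    """Return the names of the top_k stored items closest to the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query_vector, store[name]),
        reverse=True,
    )
    return ranked[:top_k]

query = [0.85, 0.15, 0.05]  # stands in for a question about getting money back
results = most_similar(query, documents)
```

Nothing in the query matches a stored vector exactly, yet the two refund-related documents rank above the shipping one.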

▶️ Data pipeline
How data flows from creation to consumption.
It keeps AI fed with timely and relevant inputs.

▶️ Orchestration
Coordinating when and how pipelines run.
It keeps jobs reliable and in the right order.
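
One way to picture "the right order" is a dependency graph. This sketch uses Python's standard-library `graphlib` (Python 3.9+) to work out a valid run order; the job names are hypothetical, and a real orchestrator also handles scheduling, retries and alerting.

```python
from graphlib import TopologicalSorter

# Hypothetical jobs, each mapped to the jobs that must finish first
dependencies = {
    "extract_orders": set(),
    "clean_orders": {"extract_orders"},
    "build_metrics": {"clean_orders"},
    "refresh_dashboard": {"build_metrics"},
}

# static_order() yields jobs so that every dependency comes before its dependants
run_order = list(TopologicalSorter(dependencies).static_order())
```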

▶️ Data quality
How accurate, complete and consistent the data is.
It directly affects confidence in AI outputs.
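
Completeness is the easiest of these to measure. A minimal sketch, using made-up rows and a hypothetical `completeness` helper that treats `None` and empty strings as missing:

```python
# Made-up sample rows for illustration
rows = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 3, "email": "c@example.com"},
    {"customer_id": 4, "email": ""},
]

def completeness(rows, field):
    """Share of rows where the field is present and not blank."""
    filled = sum(1 for row in rows if row.get(field) not in (None, ""))
    return filled / len(rows)

email_completeness = completeness(rows, "email")
id_completeness = completeness(rows, "customer_id")
```

Here half the email values are usable, while every row has an ID.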

▶️ Observability
Seeing what data systems are doing and spotting issues early.
It helps catch drift and weird behaviour before damage is done.

▶️ Data lineage
Where data comes from, how it changes and where it’s used.
It adds transparency and explainability to AI decisions.
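
In its simplest form, lineage is a graph you can walk backwards. The sketch below assumes a hypothetical mapping from each dataset to the datasets it was derived from, and traces everything upstream of a training set.

```python
# Hypothetical lineage: dataset -> datasets it was derived from
derived_from = {
    "training_set": ["customer_features"],
    "customer_features": ["crm_export", "web_events"],
    "crm_export": [],
    "web_events": [],
}

def upstream_sources(dataset, lineage):
    """Collect every dataset that feeds into the given one, directly or indirectly."""
    seen = set()
    stack = [dataset]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

sources = upstream_sources("training_set", derived_from)
```

In this toy graph, the training set traces back to one intermediate table and two raw sources.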

None of this is magic.
But together, it’s the foundation AI stands on.

What other terms would you add as essential?

♻️ Repost to help someone get their idea into action.
🔔 Follow Clare Kitching for insights on unlocking value with data & AI.
💎 Get more from me with my free newsletter here: https://lnkd.in/giQ3b6Fi

Rosemary Daly: Lineage is the one I see fail most, Clare. Teams build beautiful semantic layers, then later people have trouble answering "where did this training sample actually come from?" and the model's trust story collapses. Your glossary is the vocabulary. Keeping it alive once the AI is in production is where the assurance work starts. (Apr 22)
Lorphic: Most AI failures aren't intelligence problems; they're translation problems between messy data and unclear meaning. (Apr 22)