Thursday, June 10, 2010

Domains of Information

I have a hypothesis that each application database, and each stand-alone database, is a "contained" information domain that holds sets of data elements which define its context.
If my hypothesis is correct, then each of these domains could be described with an ontology, whereby the context and definition of the data sets would be shared by the community that is the primary user/consumer of the information within the database.

Each database within a company would have its own primary ontology, which is nothing more than a fully described database schema. Since the guiding rules of relational design say that a schema should be normalized to third normal form (3NF), the primary ontology is built on that normalized definition of the schema.
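To make the idea of a "fully described schema" a little more concrete, here is a minimal sketch that reads a schema back out of a database as a plain description of tables, columns, and relationships. It uses SQLite through Python's standard `sqlite3` module only because it is easy to run. The `customer` and `invoice` tables and the `describe_schema` helper are hypothetical names invented for illustration, and a real ontology would carry far more meaning than column names and types.

```python
import sqlite3


def describe_schema(conn):
    """Describe every table: its columns (name -> declared type) and the
    tables it references through foreign keys."""
    description = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = {
            row[1]: row[2]
            for row in conn.execute(f"PRAGMA table_info('{table}')")
        }
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        references = [
            (row[3], row[2])
            for row in conn.execute(f"PRAGMA foreign_key_list('{table}')")
        ]
        description[table] = {"columns": columns, "references": references}
    return description


# Hypothetical two-table schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE invoice (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        total REAL NOT NULL
    );
""")
ontology = describe_schema(conn)
```

The result is only the skeleton of a primary ontology: it records what entities exist and how they relate, but the definitions that the user community attaches to each element would still have to be added by people who know the domain.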

Taking this further, we can envision that each database is its own domain with its own ontology, with the enterprise having sets of ontologies. I am using the term "sets" to mean a collection of ontologies that reflect a similar context within combined user communities of interest. This concept provides a method for describing and cataloging information sources across the enterprise. The individual information sources are defined within the context and rules of the primary ontology, meaning that the data within an information source is defined by the primary community for which the database was developed.
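As a rough illustration of what such a catalog might look like, the sketch below records each information source with its primary community and groups the sources into "sets" by community of interest. Everything here is an assumption made for the example: the `InformationSource` class, the `group_into_sets` function, and the database and community names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InformationSource:
    name: str                   # the database being cataloged
    primary_community: str      # the users the database was built for
    community_of_interest: str  # the wider group whose context it shares
    terms: frozenset            # vocabulary defined by its primary ontology


def group_into_sets(sources):
    """Group information sources into sets, one per community of interest."""
    sets = defaultdict(list)
    for source in sources:
        sets[source.community_of_interest].append(source.name)
    return dict(sets)


# Hypothetical catalog entries, for illustration only.
catalog = [
    InformationSource("billing_db", "accounts receivable", "finance",
                      frozenset({"invoice", "customer"})),
    InformationSource("ledger_db", "general accounting", "finance",
                      frozenset({"account", "journal_entry"})),
    InformationSource("crm_db", "sales", "customer relations",
                      frozenset({"lead", "customer"})),
]
ontology_sets = group_into_sets(catalog)
```

Note that `billing_db` and `crm_db` both use the term `customer` even though they sit in different sets, which is exactly the kind of overlap a catalog like this would help surface.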

If we catalog all information sources within a company, and we have formal ontologies for each of these information sources, it becomes evident that we can evaluate the business value of these information sources based on the business value chains that are defined by the organization or company.

So have I sparked any thoughts or comments yet?

I will continue to build this idea in future posts, and I would value your input...

Wednesday, June 9, 2010

What do people think information is?

I thought I would start a series that begins with a question:
Is the data contained in a database finite, contained, and controlled?

Assume that a relational database supports an application, and that the application uses the database to store, retrieve, update, and query its data.

Would you believe that the data is confined to the data definitions that are used by the application and its users? Would the application also control the data definitions and not allow data that it had not defined? Would you also agree that the data is finite within the boundaries of what the application has specified in its domain of understanding?

Can I make this statement about the application's database: the data within the database is contained within a domain that is defined by the application's semantic definition of data entities and tables, and by the relationships between the data sets that make up the database?

In other words, the data within a database is composed of sets of data elements that are grouped together based on the definition of a vocabulary (i.e., a taxonomy) that is understandable by the application. Once defined, that vocabulary can only be changed if the application is changed to accept the new definition or new element.
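A small sketch can show this "contained and controlled" behavior at the database level. It assumes SQLite via Python's standard `sqlite3` module, and the `patient` table, the `blood_type` element, and the `try_insert` helper are hypothetical names chosen for the example. Data that uses an element outside the defined vocabulary is rejected until the definition itself is changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The application's current vocabulary: a patient has an id and a name.
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def try_insert(sql, params):
    """Return True if the database accepts the statement, False otherwise."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.OperationalError:
        # Raised when the statement names a column the schema does not define.
        return False


new_row = "INSERT INTO patient (name, blood_type) VALUES (?, ?)"

# "blood_type" is not part of the defined vocabulary, so the data is rejected.
accepted_before = try_insert(new_row, ("Ann", "O+"))

# Only a change to the definition lets the new element in.
conn.execute("ALTER TABLE patient ADD COLUMN blood_type TEXT")
accepted_after = try_insert(new_row, ("Ann", "O+"))
```

In practice the application code would also have to change before users could enter or see the new element, so the schema change is only the database half of the story.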

Within an organization there may be many applications, each supported by its own relational data source. If each of these applications was developed by and for different users, then the data set definitions within those applications' databases would reflect the vocabulary of those users. So the semantic meaning of the data sets within these databases would only be relevant to the users who know and understand that vocabulary.

Should you force all users to accept only one vocabulary within a company or organization? Is this how an organization really operates, or is a company composed of many diverse communities that perform functions based on their own community's vocabulary?

So, just some random questions. I will post more on this, but I am looking for your reaction to the points I raised...