Data Mapping


When exchanging data, one has to map the source database's data elements to the elements of the destination database's tables. Sounds simple, doesn't it? Just get the data file, map it to where it is going, and send it over. Translate "male" to "M" and "common cold" to "Rhinitis". But real life is not that simple. There is a problem inherent in mappings: databases from different organizations commonly vary in the granularity of their concepts. If two terminologies have different granularities, they hold different levels of detail for similar concepts, so the concepts will not map one to one.

For instance, let's say that you store your basketballs, baseballs, golf balls, and marbles in separate boxes. I have that stuff too, but I store my softballs and hardballs separately. All the rest of my boxes are the same as yours, so I have one more box than you. When I tell you I have 30 hardballs and 20 softballs, you have to know to combine them in your system into 50 baseballs. Fine. But when the information is sent back to me, my distinction has been lost. "50 of what?" I say. "Is that hardballs or softballs?" You don't know. The data cannot be reconstituted. It is not like condensed soup, which is uniform throughout.
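The box analogy can be sketched in a few lines of Python. This is only an illustration: the category names come from the analogy, and the helpers `send` and `invert` are hypothetical names invented for this sketch, not part of any real mapping tool.

```python
# My categories mapped onto yours. Two of mine collapse into one of yours.
# (Illustrative data taken from the box analogy; not a real code set.)
MY_TO_YOUR = {
    "hardball": "baseball",
    "softball": "baseball",
    "basketball": "basketball",
    "golf ball": "golf ball",
    "marble": "marble",
}

def send(counts, mapping):
    """Translate counts keyed by the sender's categories into the receiver's."""
    out = {}
    for category, n in counts.items():
        target = mapping[category]
        out[target] = out.get(target, 0) + n
    return out

def invert(mapping):
    """Group source categories by target category."""
    reverse = {}
    for source, target in mapping.items():
        reverse.setdefault(target, []).append(source)
    return reverse

mine = {"hardball": 30, "softball": 20}
yours = send(mine, MY_TO_YOUR)

# Any target with more than one source cannot be mapped back unambiguously.
ambiguous = {t: s for t, s in invert(MY_TO_YOUR).items() if len(s) > 1}

print(yours)      # {'baseball': 50}
print(ambiguous)  # {'baseball': ['hardball', 'softball']}
```

Checking a mapping for targets with more than one source, before any data is sent, is one simple way to find out where a round trip will lose detail.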

Your mission determines what you want to store in your databases. Suppose I don't have any volleyballs. When you tell me you have two volleyballs, I don't know what to do with that information. What's more, I simply don't care, so I store it as "Balls - Other". OK, fine. Next, I hand you data about badminton birdies. "It's not a ball at all," you say. You don't play badminton, so you just call it "other sports equipment". Once again, when we need to send the data back to its source, the volleyball and birdie data has been lost. My birdies are now under "Other" when I get them back from you. You don't even get any volleyball information back. There is a loss of meaning when going to the less specific or nonmatching concept and back again. So it's not so easy to exchange data meaningfully. I have syntactic understanding (the field starts at the fifth byte and is 20 characters long) but not exact semantic understanding. I have a bunch of "Other balls", zero softballs, and zero hardballs, which is not too helpful if your life depends on it. What does "Other" mean now?
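A minimal sketch of the "Other" problem, assuming each side simply falls back to a catch-all label for anything it does not recognize. The sets of known categories and the `store` helper are hypothetical, made up to match the story above.

```python
# Categories each side knows about (illustrative only).
MY_KNOWN = {"hardball", "softball", "basketball", "golf ball", "marble", "birdie"}
YOUR_KNOWN = {"baseball", "basketball", "golf ball", "marble", "volleyball"}

def store(label, known, fallback):
    """Keep a label the receiver recognizes; otherwise file it under the fallback."""
    return label if label in known else fallback

# You send me volleyballs; I have no such box.
volleyball_at_mine = store("volleyball", MY_KNOWN, "Balls - Other")

# I send you birdies, and later you send the record back to me.
birdie_at_yours = store("birdie", YOUR_KNOWN, "other sports equipment")
birdie_back_home = store(birdie_at_yours, MY_KNOWN, "Balls - Other")

print(volleyball_at_mine)  # Balls - Other
print(birdie_at_yours)     # other sports equipment
print(birdie_back_home)    # Balls - Other
```

Each hop is syntactically valid, yet after the round trip the original label "birdie" is gone and nothing in the stored value says what it used to be.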

If we transmit medical data around enough, will we have everyone diagnosed as "Other" and taking that "other" prescription? No, it won't be that bad, but there will be problems. If the Federal government forces us to use values that are meaningful only for certain organizations, then standards will hurt us rather than help us, because of differing granularities. What is required is an in-depth understanding of the mission of every entity, so that valid, detailed concepts can be established. The FBI's concept of the values for the field "sex" is very different from that of the Veterans Health Administration (VHA). The FBI has a purpose for the data: they need to identify people. Hence, for the FBI, the data standard needs a sexual appearance concept. The FBI has 23 values for the data field they call sex. All the values are based on what a person looks like.

The VHA also has a purpose for the data field they call sex. They need to treat people medically and be able to assign beds to patients. They don't want a man named Pat in the maternity ward. For the VHA, the data standard needs an administrative sex concept. When one works at the data concept level, the data field contents make sense in the context of their purpose. Then the FBI can have their "female appearing as male" value and the VHA can have their "administrative male gender" value. That's why both the VHA and the FBI can benefit from participating in ongoing standards development work. When IT people understand how an organization carries out its mission, they can define the concepts to a very fine level. Once we reach this very fine level of definition, we can send messages without loss of meaning. We now know that in the data standard we are developing, we need both a sexual appearance field and a medical gender field. Then the data fields will carry clear meanings when they are received by yet another Federal organization, like FEMA.
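One way to picture "both a sexual appearance field and a medical gender field" is a record with a separate slot per concept. The sketch below is an assumption for illustration only: the class name, field names, and `populated` helper are invented here, and the two values are just the examples quoted above, not actual FBI or VHA code sets.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SexConcepts:
    """One slot per concept, so neither agency's value is forced into the other's field."""
    sexual_appearance: Optional[str] = None   # identification purpose
    administrative_sex: Optional[str] = None  # treatment and bed assignment purpose

def populated(record):
    """List the (concept, value) pairs a record actually carries."""
    pairs = [
        ("sexual_appearance", record.sexual_appearance),
        ("administrative_sex", record.administrative_sex),
    ]
    return [(concept, value) for concept, value in pairs if value is not None]

# Records about two different people, from two different agencies.
fbi_record = SexConcepts(sexual_appearance="female appearing as male")
vha_record = SexConcepts(administrative_sex="administrative male gender")

print(populated(fbi_record))  # [('sexual_appearance', 'female appearing as male')]
print(populated(vha_record))  # [('administrative_sex', 'administrative male gender')]
```

A third agency receiving either record can tell which concept a value belongs to, and an empty slot says "not recorded" rather than "Other".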






Data Mapping Problems in Data Exchanges Between Enterprises

When we map and transmit medical data from one agency to another without resolving semantic issues, we may lose the meaning of the data. If your enterprise and my enterprise store different types of details, I may be tempted to solve my problem by lumping your non-matching items into my "Other" category, just so I can view the majority of what you sent me. But when I send your data back to you, it won't look the same to you anymore. This is because we have different concepts of our data. We may have different opinions about which details are important. When data field mappings cannot be done with one-to-one accuracy, the data concepts need to be examined closely so that the data transmitted in messages can be understood and stored properly. This is why developing data standards is so crucial.

The Federal Health Information Model (FHIM) is an effort to coordinate data models for the Federal Health Architecture (FHA) partner agencies. The FHIM may be viewed at www.fhims.org. It is a computationally independent information model.


Our Mission Determines Our Data Concepts

The data concepts of an organization come from its mission. The FBI's databases contain different information from the CDC's, which in turn contain different data from FEMA's. Each agency is after a different functional result for a different purpose, yet all may have data fields with the same names. It is WHY they have these data fields that matters. From the WHY, which is the purpose of the organization's mission, we derive a data concept. The data standard then becomes a catalog of data concepts, and we can map concept to concept. XML enables us to label data fields with their standard concept names when they are transmitted between agencies. Ideally, a reference table of each data field's purpose should be maintained.
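As a rough sketch of labeling fields with concept names in XML, the Python below uses the standard library's `xml.etree.ElementTree`. The element name `field`, the `concept` attribute, the concept names, and the purpose table are all assumptions made up for this illustration; they are not taken from the FHIM or any published standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical reference table: concept name -> purpose of the data field.
CONCEPT_PURPOSE = {
    "SexualAppearance": "Identify a person by how they appear",
    "AdministrativeSex": "Treat patients medically and assign beds",
}

def label_field(local_name, concept, value):
    """Build a field element tagged with its standard concept name."""
    if concept not in CONCEPT_PURPOSE:
        raise KeyError(f"unknown concept: {concept}")
    element = ET.Element("field", {"name": local_name, "concept": concept})
    element.text = value
    return element

# Two agencies both call their field "sex", but label different concepts.
message = ET.Element("message")
message.append(label_field("sex", "SexualAppearance", "female appearing as male"))
message.append(label_field("sex", "AdministrativeSex", "administrative male gender"))
xml_text = ET.tostring(message, encoding="unicode")

# The receiver reads by concept, not by the ambiguous local field name.
by_concept = {f.get("concept"): f.text for f in ET.fromstring(xml_text).iter("field")}
print(by_concept["AdministrativeSex"])  # administrative male gender
```

Because the receiver keys on the concept label, two fields that share the name "sex" no longer collide, and the purpose table documents why each one exists.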
