Continued from:
'Measuring Data Similarity or Dissimilarity #1'
'Measuring Data Similarity or Dissimilarity #2'
3. For Ordinal Attributes:
An ordinal attribute is an attribute whose possible values have a meaningful order or ranking among them, but the magnitude between successive values is not known. Ordinal values are the same as categorical values, but with an order. For example, the values of a "Performance" column might be: Best, Better, Good, Average, Below Average, Bad.
These are categorical values with an order or rank, which is why they are called ordinal values. Ordinal attributes can also be derived by discretizing numeric attributes, that is, by splitting the value range into a finite number of ordered categories.
To calculate similarity or dissimilarity, we assign a rank to each category. That is, an attribute f with N possible states has the ranks `1, 2, 3, ..., N`.
How to Calculate Similarity or Dissimilarity:
1. Assign a rank `R_if` to each category of attribute f, which has N possible states.
2. Normalize the rank to the range [0.0, 1.0] so that each attribute has equal weight. The normalized rank can be calculated as
`R_in = \frac{R_if - 1}{N - 1}`
3. Now the similarity or dissimilarity can be calculated with any distance measure (see 'Measuring Data Similarity or Dissimilarity #2').
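The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `normalized_rank` and `ordinal_dissimilarity` are made up for this example, "Bad" is assumed to be rank 1 and "Best" rank N, and the absolute difference (Manhattan distance on a single attribute) stands in for "any distance measure".

```python
# Assumed ordering of the "Performance" attribute, from worst (rank 1) to best (rank N).
LEVELS = ["Bad", "Below Average", "Average", "Good", "Better", "Best"]


def normalized_rank(value, levels=LEVELS):
    """Steps 1 and 2: look up the rank R_if and normalize it to [0.0, 1.0]."""
    n = len(levels)
    if n < 2:
        raise ValueError("need at least two ordered states to normalize")
    rank = levels.index(value) + 1   # step 1: rank in 1..N
    return (rank - 1) / (n - 1)      # step 2: (R_if - 1) / (N - 1)


def ordinal_dissimilarity(a, b, levels=LEVELS):
    """Step 3: distance between normalized ranks (absolute difference here)."""
    return abs(normalized_rank(a, levels) - normalized_rank(b, levels))


if __name__ == "__main__":
    for a, b in [("Best", "Bad"), ("Good", "Average"), ("Good", "Good")]:
        d = ordinal_dissimilarity(a, b)
        print(f"{a} vs {b}: dissimilarity={d:.2f}, similarity={1 - d:.2f}")
```

Because the normalized ranks lie in [0.0, 1.0], the dissimilarity does too, so a similarity can be taken as 1 minus the dissimilarity. Reversing the rank order (Best as rank 1) would give the same dissimilarities.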