EDITOR’S Q&A
RAKESH JAYAPRAKASH, PRODUCT MANAGER, MANAGEENGINE
In this decade, if there is one thing that
we are not running low on, it’s data.
Depending on how you look at it, having
large volumes of data at your disposal is a
gift or a curse. When you have abundant
data, there is less likelihood that your
interpretations are skewed, because short-term
seasonal anomalies are avoided.
However, the downside of large data sets is
maintaining them over a long period because
organisations do not have an infinite amount
of storage space to work with. Also, it may
not be required for organisations to store
and maintain historical data older than, say,
12 months, because it might not help them
make present-day decisions.
At ManageEngine, we help our
customers get the most out of their data by
recommending that they set up a mechanism
involving three key steps:
Defining goals
This is the first and most important
step for any data analysis project.
Stakeholders must establish what they
wish to accomplish and the KPIs to track in
order to help them make strategic business
decisions. Though this may sound complex,
necessary KPIs can be easily established
by listing questions the business wants
answered. These can be questions such as,
‘What products should I spend my marketing
budget on during the holiday season?’ and
‘Which demographic do our products appeal
to the most?’
Establishing clear goals can help narrow
down the type and source of data that would
answer these questions directly. It can also
help cut down the volume of data that needs
to be stored and analysed by about 40 to 60%.
Building an automated data pipeline
Once the data analysis goals are established,
organisations would have a fair idea of
which data to use and which to discard. This
learning should be fed into data pipeline and
ETL (extract, transform, load) tools, which, in
addition to gathering data from a variety of
sources and cleansing them, can also discard
data that is irrelevant for analysis. Doing
this will ensure that data noise is reduced
at the source rather than having to discard
unwanted data once it has reached data
warehouses or analytics applications.
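To make this concrete, here is a minimal sketch of such source-side filtering, assuming a hypothetical pandas-based pipeline; the column names, regions, and file paths are illustrative placeholders rather than ManageEngine-specific settings.

```python
import pandas as pd

# Illustrative placeholders: only the fields and regions that map directly
# to the defined KPIs are kept.
RELEVANT_COLUMNS = ["order_id", "product", "region", "revenue", "order_date"]
RELEVANT_REGIONS = {"EMEA", "APAC"}

def extract_and_filter(csv_path: str) -> pd.DataFrame:
    """Extract raw records and drop columns and rows irrelevant to the KPIs."""
    raw = pd.read_csv(csv_path, usecols=RELEVANT_COLUMNS, parse_dates=["order_date"])
    # Discard irrelevant rows at the source, before they ever reach the
    # warehouse or the analytics application.
    return raw[raw["region"].isin(RELEVANT_REGIONS)]

if __name__ == "__main__":
    filtered = extract_and_filter("orders.csv")                    # extract + transform
    filtered.to_csv("warehouse/orders_filtered.csv", index=False)  # load
```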
Archiving historical data
We are in an era where companies must
evolve their market approach so fast that
the data used for decision making six or 12
months ago would not be relevant for the
present day. Businesses must rapidly adapt
to changing conditions to ensure continuity
and growth, so their decisions must be
based on the most current and relevant
data. Businesses must determine the
relevance of historical data and establish a
baseline that indicates how far back in time
the historical data should go. This will help
them archive or discard old data that is no
longer of any use.
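As a rough illustration of such a baseline, the sketch below splits records at an assumed 12-month cut-off, moving older rows to an archive file and keeping only recent ones live; the DataFrame layout and paths carry over the same hypothetical assumptions as the earlier example.

```python
import pandas as pd

def archive_old_records(df: pd.DataFrame, months: int = 12) -> pd.DataFrame:
    """Split records at the retention baseline and return only the recent ones."""
    cutoff = pd.Timestamp.today() - pd.DateOffset(months=months)
    old = df[df["order_date"] < cutoff]
    recent = df[df["order_date"] >= cutoff]
    # Older records go to cheap cold storage, or are dropped entirely if the
    # business decides they hold no further value.
    old.to_csv("archive/orders_pre_baseline.csv", index=False)
    return recent
```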