Data Clutter

2014/05/16: “A+ Decisions with Big ‘D’ Data”. Top-down companies can’t do analytics unless the Strategy team is at the C-level. Smarter Transportation: move from real-time data to predictive; sharing data, open (public) data – especially across municipalities. Lifecycle of Decision-Making Inputs:

Right data
Objectives – why analyzing
How present to stakeholders
Measuring outcomes – refine the inputs and how present them

Submit Interop abstract based on JFB’s Smarter Cities presentations.

2014/04/22: From an Estuate Webinar blurb: “Most customers of Oracle applications (like Oracle EBS, Siebel, PeopleSoft, JD Edwards, and custom applications) have accumulated many years of data in their production databases, most of which is rarely accessed. Such data acts like “data cholesterol,” slowing system performance and reducing user productivity. You can eliminate data cholesterol by archiving unused data, and still have it readily available for business purposes.”

2014/04/03: Time cards, procurements, and expense reports are typical sources of corporate data clutter. JFB had a big expense report held up for a few weeks because he needed to report the hotel VAT in the room rate instead of the room tax. How much did that cost in terms of both his time and AP time? And what if he’d had to pay off the credit card and then deal with getting reimbursement? PEG talked about VZ requiring receipts for any expense US$25 or over – even for an air travel e-ticket complying with VZ’s rules for air travel booked via AMEX Travel and charged to a corporate AMEX card. And then even though the (very user-unfriendly SAP) expense report application requires associating applicable AMEX card expenses – such as the air travel booked by AMEX Travel – with an expense report, they also still require the expense report submitter to, through the application, print the fax cover sheets and then fax the air travel (and other) receipts with the cover sheets to VZ Accounts Payable. This requires the submitter to first print the email from AMEX Travel containing the charges for the air travel. (The email also contains applicable predicted charges for the rental car and hotel but those charges don’t get processed until the rental car has been returned or hotel stay has been completed – at which point the traveler is provided a receipt directly from the provider.) IBM at least doesn’t require receipts for expenses under US$75 (the IRS limit), air travel charged on the corporate AMEX card, and/or lodging with an approved provider charged to the corporate AMEX card. Also, the fax cover sheet is a single sheet and the expense report application is much easier to use. (The IBM application could be better but it’s simple compared to VZ’s.)

The scientific method is not often followed in business – most organizations implement blanket rules instead of first experimenting and testing with a small, representative portion of the organization. This is especially the case for the typical internal IT Department – which operates as a cost center rather than as a P&L-driven vendor (to the rest of the organization).

Spot pricing – need to educate the buyers and need to reassure them that it won’t be discriminatory.

2014/03/19: The 2014 March 17 Good Reads section of the Christian Science Monitor included the blurb “Tech management tips” which paraphrases Clay Shirky’s comments in Foreign Affairs as saying that US government projects such as HealthCare.gov have floundered because they haven’t used agile and test-driven development. His points that we need to divide a large project into chunks that can be…

2014/03/18: Alain De Botton, in a City Arts and Lectures talk on 2014 March 5 (which aired on KQED Radio on 3/19 at 2am), stated that information has to be presented in a form that catches our interest. Raw data is not impactful; it’s art that gets us to care. About 35 minutes into the show he said the news is serving us mere raw data that by itself cannot penetrate our defenses. We have so much news, what criteria should we use to pick what’s going to fill the paper? We have to ask, “What is the purpose of the news?” The usual answer, after some thought, is “to hold the powerful to account”. De Botton’s view is that “…news is information that should help the individual and the nation to flourish…” We need to ask, “What is the stuff that we’re gathering amount to?” “What is the guiding principle by which we’re trying to get information through to people?”

2014/03/13: Difference between Big Data and Analytics is the difference between collecting and storing a lot of data and extracting value out of data. Optimally, Analytics should narrow down both the data that needs to be collected and the amount that subsequently gets stored. Further, it should dictate how long, in what form, and where the data is stored as well as the structure for managing access to that data. ‘where’ includes number of copies and the authoritative source; ‘access’ includes tagging and indexing and proactive notification of new high-value data. (An encyclopedia by default is passive – you have to look up information. Wikipedia is reactive when it notifies you of a change to an article that you’ve tagged for notification. Wikipedia would be proactive if it solicited new articles based on your queries – and not necessarily just the explicit queries but rather a chain of related topics.)

Express data clutter in terms of volumes of paper. For example, a moving van full of magazines is the equivalent of the electronic newsletters and whitepapers sitting in your inbox and hard drive. Perhaps the proportion of each type of media in the equation to calculate the paper equivalence of electronic data has changed in the last decade. For example, photos and videos take more storage space relative to their paper equivalents. (Is the paper equivalent of a video the transcript of the audio plus one page per frame of the video – where a one-minute video at 30 frames per second would be 1,800 pages – or should the video be measured in terms of a strip of movie film measured in frames?) Can think about saving data in terms of saving tax records: once you’ve done your annual taxes (analyzed your financial data) you should get rid of all the past year’s paperwork not needed for your taxes (or for some other purpose) and the eight*-year old archived paperwork (again, excepting the records needed for some other purpose). (*US IRS rules) Likewise you don’t need to keep all of the source data for data analysis.
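The one-page-per-frame arithmetic in the video example above can be sketched as follows. This is only an illustration of that one counting rule: the function name and the default frame rate are assumptions made here, and the transcript pages mentioned above are deliberately left out.

```python
def video_paper_pages(duration_seconds: float, frames_per_second: float = 30.0) -> int:
    """Paper equivalent of a video if every frame is printed on its own page.

    Assumption: one page per frame; transcript pages are not counted.
    """
    return round(duration_seconds * frames_per_second)


if __name__ == "__main__":
    # One minute of video at 30 frames per second.
    print(video_paper_pages(60))
```

Measuring the same video as a strip of film instead would give the same frame count; only the physical unit (pages vs. frames of film) changes.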

We share data in the library but we’re not really sharing data in Facebook in that we each have our own Facebook page and timeline. Facebook tightly controls access to the shared, encyclopedic view since that’s a large part of how they monetize the service.

Data diarrhea vs. data vomiting vs. data yawning

To get a reasonable response rate at a reasonable cost for a survey of everyone in a large population you need to make the survey quick and easy to take and the survey respondents need to be people that you have access to and can motivate to respond. In order to test the validity of the responses you need to test the consistency. For a very homogeneous population motivated to respond accurately, the mean and standard deviation are sufficient. Otherwise, you need to segment the population and then look at the consistency within each segment (or a sampling of segments). If you cannot get to a reasonable consistency at a reasonable level of segmentation (which in part is a level at which the number of responses in a segment is still statistically significant), then you need to redo some or all of the survey, enrich the data, discard the responses to one or more of the survey questions, or discard or discount the analysis of some or all of the segments. And even this analysis of the validity of the data is insufficient in that you cannot tell if the responses of the people who responded are consistent with those who did not and/or if the inaccuracies in the responses were proportionally evenly distributed. Is it really the “will of the people” (i.e. are the voting results accurate) when 99% of eligible voters voted… but only because they were required to do so by law and the government’s choice was the only one? Can we expect consistent accuracy across various types of respondents to a survey of men’s knowledge of their life partners’ clothing sizes? (Gay men are more likely to accurately estimate their (male) partners’ sizes.)

For example, based on US census data alone you cannot determine the undercount of homeless people and of households with undocumented immigrants. Because those two population segments often live in large cities and are often in need of assistance (i.e. are disproportionately represented in large, urban areas and in the lowest income brackets), federal funding for services that is divided between states based on their relative share of the nation’s needy is unfairly distributed. As a side effect, though many services can be more efficiently delivered in urban than in suburban or rural areas, states are not incented by the funding model to support the urban poor. Another possible side effect is a funding mismatch when a service budget is based on census data but is reimbursed on the basis of people served – the number of student-hours or emergency room visits, for example.

compare a sampling of responses within each segment. If the sampling within a segment is very consistent then compare that segment with others. If not, then from sample
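A rough sketch of the per-segment consistency check described above, assuming numeric responses. The coefficient-of-variation threshold and the minimum segment size are placeholders invented for illustration, not recommended values, and the function name is hypothetical.

```python
from statistics import mean, stdev


def segment_consistency(responses, max_cv=0.25, min_count=5):
    """Label each segment by how consistent its numeric responses are.

    responses: dict mapping segment name -> list of numeric answers.
    max_cv and min_count are illustrative assumptions, not standards.
    """
    report = {}
    for segment, values in responses.items():
        # Too few responses to say anything about this segment.
        if len(values) < max(min_count, 2):
            report[segment] = "too few responses"
            continue
        avg = mean(values)
        spread = stdev(values)  # sample standard deviation
        # Coefficient of variation: spread relative to the mean.
        cv = spread / abs(avg) if avg else float("inf")
        report[segment] = "consistent" if cv <= max_cv else "inconsistent"
    return report


if __name__ == "__main__":
    sample = {
        "segment A": [10, 10, 11, 9, 10],
        "segment B": [1, 20, 5, 40, 2],
        "segment C": [3, 3],
    }
    for name, verdict in segment_consistency(sample).items():
        print(name, verdict)
```

Segments flagged as inconsistent or too small are the ones that would need re-surveying, enrichment, or discounting. Nothing here addresses the non-respondent problem raised above.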

Scorecards of metrics that compare alternatives based on relative measurements vs. absolute scores. Ranking vs. tiering of first pass scores. Averaging all the constituent metrics vs. (weighted) averaging top-level metrics. Difficulty of determining the financials for a privately held company or for a product or product line of a large company; ease of determining primary value proposition presented to potential buyers.

2014/03/12: Does too much data ingestion lead to indigestion?

In Smartypants (Pete In School), Pete, the extraordinarily omnivorous hound introduced in What Pete Ate From A to Z, eats the 26-volume Encyclopedia About Everything by Zelda Peabody. For the next 24 hours he is incredibly erudite (reciting from Act III of Gertrude Stein’s libretto for Four Saints in Three Acts) and proffers insights on topics ranging from gravity to daydreaming to crying. But then, having fully digested the encyclopedia, he reverts to typical canine language and concerns. What lessons does this offer the data cluttered? How fleeting is the value we extract from the data we collect? Is there value in the (lasting) insights realized by others even when we, the data collectors, merely process the information?

2014/03/11: “digitally obese”: “one gigabyte (1,073,741,824 bytes) was the equivalent of a pick-up truck filled with paper” per Caltech professor Roy Williams in a BBC article titled “Britons growing ‘digitally obese’” dated 2004 December 9. Also see ‘digitaldietguy.com’ and ‘fattyfinger.com’.

Clutter Enablers

The October 2013 issue of CRN magazine arrived with the following cover band and interior full-page advertisement.

The small text on the full page advertisement reads:

“Get your clients ready with the storage that’s ready for the future. The world of data is getting bigger by the day. That’s why for 25 years, SanDisk has pushed the boundaries of what’s possible in storage. The result is more than leading-edge storage. It’s peace of mind, even in the most challenging environments. sandisk.com”

This is akin to addressing a full closet, attic, and garage with a storage shed in the front yard. We need to reduce the amount that we need to store – electronic or otherwise – rather than increasing the amount of storage. With a flat hierarchy we do not make effective and efficient use of the material stored – especially when there’s more than one storage location and when the volume stored is more than can be simultaneously visualized. Unless the various physical storage spaces are logically represented as a single, raw data storage space, knowing that a particular set of data exists on the storage is a long ways from being able to efficiently extract it from the various applications and devices and then aggregating it back into one set of data. Further, there may not be space on any one logical storage device for the aggregated, enriched data set to be stored as a single entity.

Let Go of Clutter

The following material is adapted from Let Go of Clutter by Harriet Schechter. (To complement the book and her decluttering consulting business, she maintains a companion book site and The Miracle Worker Organizing site.)

Her terms: overwhelm, ‘clutter mutter’, ‘declutter procrastinitis’, information anxiety, ‘illegal pad’, speed-feed, Sidetracking Syndrome, subscribitis,

p. 8 ‘The “Get Over It” Worksheet’: What data do you regret losing? What was the impact of that loss?… What data do you regret not collecting? What was the impact of not having that data? (Column headers in original from left to right: ‘Items I’ve regretted getting rid of’, ‘Why?’, ‘What happened as a result?’, and ‘Was the result positive, negative, or neutral?’.)

Perhaps we need to merge or reconcile legal data conservation guidelines with a more general approach to valuing data. There could be an index based on (1) how hard it was to obtain the data in the first place and (2) how hard it would be to do it again. Sometimes the process of obtaining the data would be enough (i.e. don’t save the web page, just save the exact Google search terms you used), while other times a bookmark or a pointer to original data would suffice. For original work, this might not work as well…. Either way, the process becomes a “hash” for the data.

p. 15 Glossary: Cleaning, neatening, organizing, and decluttering

p. 20 ‘Time = $ Worksheet’: subtitled “Can You Really Afford Your Clutter?” multiplies your hourly earnings by your time spent on clutter-induced activities to calculate your annual clutter cost. Wasted time activities listed are as follows. [We need to come up with the data clutter equivalents.]

Locating papers?
Looking for misplaced items?
Being annoyed because you can’t find things?
Duplicating efforts?
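The worksheet arithmetic can be sketched as below. The book multiplies hourly earnings by time spent on clutter-induced activities; the weekly granularity, the 52-week year, and the function name are assumptions added here for illustration.

```python
def annual_clutter_cost(hourly_earnings, weekly_hours_by_activity, weeks_per_year=52):
    """Annual cost of clutter: hourly earnings times hours lost per year.

    weekly_hours_by_activity: dict mapping activity -> hours lost per week.
    Weekly tracking and weeks_per_year=52 are assumptions for this sketch.
    """
    weekly_hours = sum(weekly_hours_by_activity.values())
    return hourly_earnings * weekly_hours * weeks_per_year


if __name__ == "__main__":
    hours = {
        "locating papers": 1.0,
        "looking for misplaced items": 0.5,
        "being annoyed": 0.25,
        "duplicating efforts": 0.25,
    }
    print(annual_clutter_cost(50, hours))
```

The same calculation would apply to data clutter once the equivalent wasted-time activities have been identified.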

Classify / categorize clutter by type and location. [We need to provide data clutter category equivalents.] Rate the categories: rate which are the worst clutter offenders.

I think that could easily be done simply by building an index of repeated data. Imagine scouring through everyone’s hard drives to find out how many versions of the same or similar charts exist… We might be able to come up with a “clutter index” based on replicated data. At that point we could extrapolate the amount of space used and time used to modify branches of presentations, for instance.
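A minimal sketch of such a “clutter index”, assuming it is defined as the share of files under a folder that are byte-for-byte copies of another file. That definition is an assumption made for this sketch. It catches only exact duplicates; the “similar” charts mentioned above would need fuzzier matching.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks to limit memory use."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def clutter_index(root: Path) -> dict:
    """Group files under root by content hash and summarize the redundancy."""
    groups = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            groups[file_digest(path)].append(path)

    total_files = sum(len(paths) for paths in groups.values())
    # Every copy beyond the first in a group counts as redundant.
    redundant_files = total_files - len(groups)
    redundant_bytes = sum(
        paths[0].stat().st_size * (len(paths) - 1) for paths in groups.values()
    )
    return {
        "total_files": total_files,
        "redundant_files": redundant_files,
        "redundant_bytes": redundant_bytes,
        "clutter_index": redundant_files / total_files if total_files else 0.0,
    }


if __name__ == "__main__":
    print(clutter_index(Path(".")))
```

The redundant byte count gives the space estimate; the time estimate would need additional data such as edit histories.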

One thought comes to mind – is there a way to fingerprint PowerPoint slides so that we can track their DNA, the same way we can recreate a genealogy tree with DNA – find out what came first, what was developed later, who branched off in different directions, etc. With some of the PPT metadata and file system data this could be doable.
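One way the slide “DNA” idea might look in practice, as a sketch only. It assumes the slide text has already been extracted from each deck by some other tool, and that each deck carries a modification timestamp taken from the file system. The function names and the input shape are hypothetical.

```python
import hashlib
import re


def slide_fingerprint(text: str) -> str:
    """Hash of a slide's text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def trace_origins(decks):
    """Map each slide fingerprint to the decks containing it, oldest first.

    decks: list of (deck_name, modified_timestamp, [slide_text, ...]).
    The first deck listed for a fingerprint is the earliest known source.
    """
    lineage = {}
    for name, _modified, slides in sorted(decks, key=lambda deck: deck[1]):
        for text in slides:
            names = lineage.setdefault(slide_fingerprint(text), [])
            if name not in names:
                names.append(name)
    return lineage


if __name__ == "__main__":
    decks = [
        ("pitch_v2", 200, ["Intro", "Revenue  Chart"]),
        ("pitch_v1", 100, ["intro", "Old slide"]),
    ]
    for fingerprint, names in trace_origins(decks).items():
        print(fingerprint[:8], names)
```

An exact hash only links slides whose text is identical after normalization. Tracking slides that were edited as they branched would need a similarity measure rather than a hash.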

p. 35 – 39 Housekeeping task checklist (space and stuff maintenance in order to control clutter)

p. 57 ‘The Five Types of [Paper] Piles’: growing, stagnating, diminishing, distilled, and the double-distilled pile. ‘pile germs’ go viral when they hit the distilled pile and will cause the pile to grow rapidly if left unchecked.

p. 58 Information Anxiety is the “unrealistic desire to keep up with the massive doses of information we’re all incessantly bombarded with nowadays”… “we can’t process it [information] any more quickly than our primitive ancestors could have”. [I disagree with her last point – but I do agree that the increase of our human information processing capacity has been outpaced by the increase in [of?] information to be processed. Further, I believe that without data decluttering the latter will continue to outpace even computer-augmented information processing.]

And dealing with data sometimes makes us oblivious to emerging problems because we’re too focused on digging up piles of past information, which don’t usually showcase the upcoming black swan events. It’s as if the mountains of data hide the important bits of information that we really need to focus on.

Random thought: WWZ has a line about the 10th man – if the first nine agree, the last must disagree in order for a debate to happen. Same with data – if there is too much agreement in the data, we need to focus on what is not said or not found.

p. 63 She provides Winnie-the-Pooh’s definition of organizing “Organizing is what you do before you do it so that when you do it, it’s not all mixed up.”

p. 66 “File Index Sample” top level categories are:

Financial – including sub-categories for Insurance, Receipts, and Taxes
Household – including sub-categories for Gardening and Instructions / Warranties
Personal – including sub-categories for Correspondence and Resumes / Career Info
General – including sub-categories for Emergency, Recipes, and Travel

p. 78 – 79 ‘illegal pads’ are pads of paper with only the top couple of pages used. ‘speed-feed’ is the process of two people tackling a single-category pile of paper (sorted during a ‘speed-sort’): the assistant hands the decision-maker a piece of paper while announcing its notable characteristics, gives her a few seconds to decide the paper’s disposition, prods her to make the decision, then hands her the next item.

p. 79 – 80 Paper disposition choices are: discard, file, act on (in a timely fashion after completing a decluttering session), and forward. Use “The Five Ws of Clutter Control” questions listed below to help quickly make the disposition decision. (Once you make a decision, stop asking questions and move on to the next item.)

What is this?
Why would I want to keep it?
When would I ever need it?
Where would I look for it?
Who else might have it?

p. 81 – 82 The five steps to curing subscribitis are:

Accepting – it’s impossible to read everything that you “should” read
Evaluating – make a list of all your subscriptions and on a scale of 1 to 3 rate how often in a timely fashion you thoroughly read each
Editing – determine a manageable number of subscriptions and use the ratings to pare your list to that number
Divesting – unsubscribe or at least direct the remainder of your subscription to a worthy health and human services non-profit [Don’t just switch from a paper to an electronic subscription! Typically you’re just as unlikely to read the electronic version magazine.]
Maintaining – determine your weekly “reading maintenance” time [time to read everything to which you have subscribed rather than the time to maintain your subscriptions] and keep the list to that number
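The Evaluating and Editing steps above can be sketched as a simple ranking. This assumes that a rating of 3 means “most often read thoroughly and on time”, since the book’s scale direction is not stated here. The function name and sample subscriptions are made up for illustration.

```python
def pare_subscriptions(ratings, keep):
    """Split subscriptions into those to keep and those to divest.

    ratings: dict mapping subscription name -> rating from 1 to 3
             (assumption: 3 = read most reliably).
    keep: the manageable number of subscriptions to retain.
    """
    # sorted() is stable, so ties stay in their original listing order.
    ranked = sorted(ratings, key=lambda name: ratings[name], reverse=True)
    return ranked[:keep], ranked[keep:]


if __name__ == "__main__":
    ratings = {"Weekly A": 3, "Monthly B": 1, "Newsletter C": 2, "Journal D": 3}
    kept, divested = pare_subscriptions(ratings, keep=2)
    print("keep:", kept)
    print("divest:", divested)
```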

p. 85 Prevent ‘paper hangover’ by keeping your paper decluttering sessions to a reasonable length.

p. 89 (Chapter 5) Three types of stuff: passive (supplies), active (errands), and perpetual (clothing and household).

p. 114 “Change ‘someday’ into Sunday” – part of the space decluttering action plan.

Chapter 6 Sentimental Clutter and memorabiliacs: Personal memorabilia only has value to the person whose memories the items represent. Categorize personal memorabilia as Happy, Sad, Good, or Bad; then keep the most special items from the Happy and Sad categories and get rid of the less special items in Happy and Sad and all the items in the Good and Bad categories. Use [theoretical] ‘shockers’ – moving, disaster (e.g. fire), or death – to help differentiate sentiments. Create memory boxes for saved items that cannot be displayed. Periodically view the contents (in a reminiscence ritual, for example) and weed out items that have been superseded in sentimental value by the latest candidates for addition to the boxes.

Maybe this is a step back but is there a way to put a step in between – i.e. transfer physical clutter to digital; once it’s digital it can be tagged and disposed of automatically unless saved. Also you can track how often you actually get to it.

“when you don’t limit clutter, clutter limits you”

p. 140 References David Shenk’s Data Smog when talking about Clutter Victims.

p. 153 In a “Defining Elements Worksheet” list the defining elements for each type of item for which you want to make better future acquisition decisions. For example, if in your closet clutter cave you found a lot of Good stuff (nice clothing that you never wear), then you should compare it to the clothing that you do regularly wear and identify the characteristics of what works for you. [If you’ve been known for somewhat dubious style choices, then you may want a fashion maven friend to further winnow down the characteristics to what really works for you.] For example, if you never wear pants or skirts without pockets, then pockets is a defining element. [Unless you’re really well-trained, never wear an outfit with pockets when speaking in public – otherwise the tendency is to put your hands in the pockets which ruins your non-verbal communication and the line of your outfit.]

p. 157 (Chapter 8) The two main sources of mental clutter are stuff to remember and stuff to forget.

p. 162 The “Master List Form” headers are: Calls, Projects / To Do, To Order / To Obtain, Correspondence / To Send, Errands / To Go, and Miscellaneous. Break down each larger item into a step-by-step list [a project plan]. [I haven’t included much more on the list commentary since as a person raised creating lists it was obvious to me.]

Data Declutter

The following thoughts on data clutter started as musings while reading Let Go of Clutter.

Analogies and Tales

Goldy-bucks and the Three Business Analysts

Papa’s reports were too big and too complex (all quantitative analysis)
Mama’s reports were too soft and squishy (all qualitative analysis)
Baby’s reports were just right – the right size and depth with a bit of dreaming (intuition in addition to the quantitative and qualitative analyses) – and she’s still reading

Case Studies

Pre-9/11 had the data to prevent the attacks – but had too much data obfuscating what was significant.

I think there is a broader problem which is that data can’t tell us about new things or different developments – it’s better at explaining past trends that may continue into the future. But what we should be getting better at is identifying when things are going to change – and data clutter makes it worse. I think there is a legend/rumour that the thickest annual reports are issued by companies that do less well.

Terminology

‘data clutter’ – data clutter for you is that which keeps you from making effective and efficient decisions

‘data overwhelm’ – sit and spin; produces ‘clutter mutter’; related to analysis paralysis

‘hoard’ (not ‘horde’) is pejorative but ‘archive’ is not – even though the action is often the same. You want to ensure that your archiving does not become hoarding.

Data equivalent to p. 15 Glossary:

data hygiene, virus scanning, reformatting, scrubbing, harvesting (cleaning)
compressing, defragmenting, compacting, archiving (neatening)
organizing, content management (organizing)
decluttering (decluttering)

Terminology Questions

What’s the difference between:

‘classify’ and ‘categorize’
‘attribute’ and ‘characteristic’ (and ‘element’ – as in ‘defining elements’)
‘data’ and ‘information’
‘taxonomy’, ‘ontology’, ‘glossary’, ‘framework’, ‘structure’, ‘index’, ‘dictionary’

to that we should probably add “tagging”

Questions

What type of data is not electronic?
How many categories for filing data (vs. categories / classifications for filing paper clutter) is optimal? (See ‘Sound Bites’ section.)
What are the electronic equivalents of an ‘illegal pad’? An email with embedded graphics or attachments? A PowerPoint with unused slide masters? Uncompressed or fragmented data? Untagged data? A file that includes dictionary and/or libraries redundant to the operating system’s?

I don’t know if there is an equivalent since data can be crawled and indexed – although that is perhaps the analogy: illegal data is not indexed or searchable. Like an intranet!

What is the opportunity cost of data clutter and the opportunity cost of acquiring another data flow?
Is there a data equivalent of “Good Stuff” – the unfinished projects (active stuff) and the supplies that “might be useful someday” (passive stuff)? Engagement deliverables that need updating and scrubbing before reuse? Performance measurements not used by executives?

Here I would get back to process: rather than focus on what the team produced during an engagement (intellectual capital), maybe we need to refocus on process (how was it identified), codification (how was the intellectual reasoning made), people (who was involved and why were they the right or wrong people), etc… Maybe our focus on outputs is completely wrong-headed…?

Is there a data equivalent of the garage / attic / basement / barn?
Is there a data equivalent of supplies (e.g. empty gift boxes), a type of passive clutter? Templates, forms, redundant copies, one-offs (minor variations)?
What is the data equivalent of ‘clutter caves’ – cabinets and other locations that attract passive stuff collections?
What is the data equivalent to personal memorabilia (and sentimental clutter)?
Is it creating more data clutter to collect data on how well you’re doing at reducing data clutter? No, not if it’s a measurement that gives you confidence in your decisions to say no to new data clutter and to get rid of existing data clutter; not if it enables trust in your ability to make good decisions.
What data needs to be tracked for fully depreciated assets?
How many filing cabinets do you need in your primary office space if most of your data is electronic? (David Allen in Getting Things Done says that most people [in the paper world of past decades] need at least four file drawers – and hence that the office hoteling concept won’t work.)

(Added 2013/05/27)

What drives printing (conversion of electronic data into a paper format)?
What is the amount of data duplication?
How close are we to physical interaction with data models?
Which features of Google Docs are attractive for collaboration? And which features are missing or don’t work? Ditto for Lotus Connections. Which other collaboration tools should we be looking at?

I think collaborative editing (happens in Docs but not in Office) and the ability to automatically find source pages/charts without you having to rebuild them are key.

On the other hand if we stick with the previous thoughts on process, maybe we should toss charts and think about how those were created — make that process faster and the thinking/message behind the charts clearer.

Sound Bites

Just because you collected the data doesn’t mean that you need to keep it. Yes, you need historical data for trend analysis – but “incremental analyses” in a disruptive situation aren’t very useful.

Picture what your business cases, desktop (physical and electronic), mailbox, and so on will look like without the clutter. Set incentives and deadlines (with mini-goals) for performing the decluttering tasks – and leverage social motivation to get them done. Bring in a third-party with declutter expertise. Pain motivation: what was the cost (pain) of decisions made in a state of data overwhelm?

If paper is the biggest form of physical clutter (in part because a pile of paper clutter is relatively very dense), then is email the biggest form of data clutter (at least for an individual)? Tweets? Facebook posts? Maybe one way to reduce would be to have all email that requires a written response to go through a collaboration service.

For each type of data, make clear policies about what information employees should or should not keep, and for how long and where they should keep it. With data there is version control, change control, and authoritative source issues beyond the data and source denotation of paper documents (per p. 60).

Freedom from clutter requires maintenance – though in the case of (electronic) data a lot of that maintenance can be automated.

Categories for filing data translate into taxonomies and tables of contents. (See ‘Questions’ section.) The four to eight categories recommended for paper (p. 63) sound like way too few – unless you’re really only categorizing by urgency of action on the piece of data rather than by the characteristics of the data. You could also keep down the number of categories by aligning the top level with business units… or maybe with the modern replacement for Porter’s 5 Forces? Or maybe with the factors of production? Plan your content management and taxonomy before you start dumping stuff into (virtual) folders.

Make filing, tagging, commenting (collaboration), version control, change management, linking, harvesting, … easy. Make a widget a la the ‘Press This’ (WordPress) and ‘Pin It’ (Pinterest) widgets. Develop an electronic Post-It note application which allows comments on files to be created externally to the applications (e.g. Microsoft Word, Microsoft PowerPoint, and Lotus Notes) used to create and view those files, and stored with the files to which they are affixed. This is in contrast to the comment and markup functions of those applications: those functions can only be accessed from within a particular application and used against a single file; a single Post-It note can be affixed to multiple files created in disparate applications.

I cut down on illegal pads by making my own pads by tearing 8.5×11″ sheets of paper with at least half of one side unused in half across the short axis and clipping the blank sheets together with a small binder clip. So though I’m guilty of having pages of notes that I haven’t gotten around to typing in so that they can be shared and searched, I’m not constantly reaching for a new pad. When I’m done taking notes on a meeting or a book, I open the clip and remove those pages. Sometimes I staple or clip together the removed pages but since I date, number, and title the top of each page that’s not really necessary. When I have new blank pages, I open the clip and insert.

Speed-sorting… this can be automated if the data is electronic or if the data’s attributes / characteristics are electronically tagged. In parallel to answering one of “The Five Ws of Clutter Control” questions on p. 79 – 80, do clutter control tagging.
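A sketch of what automated speed-sorting over tagged items might look like. The four dispositions (discard, file, act on, forward) come from the book. The tag names and the rule order are invented here purely for illustration. Anything the rules cannot place is left for a human speed-feed session.

```python
def speed_sort(tags):
    """Pick a disposition for an item from its tags.

    tags: set of strings attached to the item. The tag vocabulary below
    is a hypothetical example, not an established scheme.
    """
    if "duplicate" in tags or "expired" in tags:
        return "discard"
    if "needs-action" in tags:
        return "act on"
    if "belongs-to-other" in tags:
        return "forward"
    if "reference" in tags:
        return "file"
    # No rule matched: leave the decision to a person.
    return "undecided"


if __name__ == "__main__":
    inbox = {
        "old newsletter": {"expired"},
        "invoice": {"needs-action"},
        "tax receipt": {"reference"},
        "mystery file": set(),
    }
    for item, tags in inbox.items():
        print(item, "->", speed_sort(tags))
```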

Mechanisms such as summary services (of books, articles, the news, etc), headline feeds, tweets, listening to the radio or podcasts while doing menial tasks (cooking, cleaning, driving, gardening), and briefings via email can all augment or take the place of paper subscriptions. Of course they too can become clutter; don’t subscribe to the podcasts if you’re not going to have time to listen regularly. Items with an expiration date such as library books help combat clutter.

Business equivalent of subscribitis: at the business level need to “just say no” to more data – unless you’re already well handling the data flowing in and the data stored and/or unless you get rid of at least an equal volume of data (that’s less valuable to you than the new data).

In the book she comments that snail mail is still a major source of paper clutter; she must not be aware of mechanisms to drastically reduce your mail volume. For example, you can sign up with Green Dimes, cancel catalogs (convert to email notifications if necessary), switch to electronic statements, and switch to autopay. We need a Green Dimes and Do Not Call list for email.

With electronic data it’s not so much about the physical storage space that it takes but rather the volume of data and the complexity of the database – number of variables, types of data, discrete / independent data sets, number of … Consider whether you want to use all of your data processing capacity on supplies; if not, then determine what portion is appropriate.

Need procedures for dealing with active stuff – the equivalent of an Errands System’s designated area (for storing / staging active stuff) and time limits (for regularly emptying the designated area).

Don’t keep data just because it might be useful “someday” – that’s like keeping clothes that might fit someday. Ask yourself “if I freed up data processing capacity, how would I like to be using it?” (Freed-up capacity does not necessarily have to be used to process other data; it could be used for creativity, intuition, invention, and innovation. Or maybe for executing and/or managing decisions. Or for improving data collection and/or hygiene.)

Don’t subscribe just because it’s free (and online). Don’t link, friend, follow, … just because you’re prompted / asked.

Understanding the “defining elements” of the data (and of its associated processes) will enable you to make better data decisions – and to have higher confidence in those decisions. (And to lower the risk of bad outcomes?) … Enables pickiness – filtering for high value.

Need to come up with data-related examples for the “Defining Elements Worksheet” (p. 153). To quote TLo, “Girl, That’s Not Your Dress!”.

Perpetual data example: hours worked by non-exempt employees. macys.com example of data clutter.

Stuffing yourself full of data is no less (maybe even more) wasteful than discarding it. Analogy: “clean your plate” is not the way to go; it’s better to compost the excess directly. Hoarding and clutter are unchristian.

Reduce mental clutter by reducing data; by making it easy to (electronically) annotate (electronic) documents / … with your thoughts.

Electronic data organization shouldn’t merely mirror physical data organization – static set of folders in each person’s cabinet with items often duplicated across cabinets but rarely across folders within a cabinet – but instead allow for a single authoritative source for each item, for each item to appear in every applicable data set, and for access and version controls. (Written by PEG in her review for Getting Things Done by David Allen in the HCM Books post.)

Confluence Wiki software allows for an excerpt defined on one page to be included on any number of other pages – enabling a single authoritative source of that content (the content may only be modified on the source page) and enabling that content to appear on several pages without having to copy the data.
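
A toy model of the single-authoritative-source idea, to make the mechanism concrete. This is not Confluence’s API or macro syntax; the `Wiki` class and the `{{include:key}}` placeholder are invented for the sketch. Pages store only a reference to the excerpt, so editing the excerpt in one place changes every page that includes it.

```python
import re

class Wiki:
    """Toy wiki: excerpts live in one place, pages include them by key."""

    INCLUDE = re.compile(r"\{\{include:([\w-]+)\}\}")  # hypothetical syntax

    def __init__(self):
        self.excerpts = {}  # key -> text (the single authoritative copy)
        self.pages = {}     # title -> body containing include placeholders

    def set_excerpt(self, key, text):
        self.excerpts[key] = text

    def set_page(self, title, body):
        self.pages[title] = body

    def render(self, title):
        # Resolve placeholders at render time; nothing is copied into pages.
        return self.INCLUDE.sub(lambda m: self.excerpts[m.group(1)],
                                self.pages[title])

wiki = Wiki()
wiki.set_excerpt("retention", "Keep expense receipts for 7 years.")
wiki.set_page("Finance", "Policy: {{include:retention}}")
wiki.set_page("Onboarding", "Remember: {{include:retention}}")
before = (wiki.render("Finance"), wiki.render("Onboarding"))

# One edit at the source shows up on every including page.
wiki.set_excerpt("retention", "Keep expense receipts for 3 years.")
after = (wiki.render("Finance"), wiki.render("Onboarding"))
```

The retention periods are placeholder text, not a recommendation.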

Another random thought that didn’t belong anywhere else in here: is our tendency to hoard a way for us to fight our inevitable death? We keep things so that something hangs around beyond our life, or to help make sure a semblance of our life is there after we’re gone. Older people sometimes switch to a phase of “getting rid of stuff”. Imagine if you had 10,000 years to live – would you keep as much, or would you not bother since you’d always have time to find it or recreate it?

Also: data is different than physical clutter because the space it takes up is almost invisible. Maybe we need to make hard drives inflate a proxy (a balloon outside a computer, a number outside a data centre, etc.) to illustrate the space taken up by the data. Petabytes and zettabytes are too big for comprehension – just like billions and trillions of dollars don’t make any sense to us in a meaningful way.

References

Biblical sayings about hoarding – especially Luke 12:16 – 21.

Books

Data Smog by David Shenk – published 1997 so it could use an update.

Let Go of Clutter by Harriet Schechter.

National Resource Center on AD/HD “A Guide to Organizing the Home and Office” white paper.

Harriet Schechter’s recommendations in Let Go of Clutter.

Aslett, Don; Clutter’s Last Stand, 1985, Writer’s Digest Books.
Culp, Stephanie; How to Conquer Clutter, 1989, Writer’s Digest Books.
Felton, Sandra; When You Live with a Messie, 1994, Revell.
Felton, Sandra; The Messies Superguide, 1991, Revell.
Goldsmith, Olivia and Amy Fine Collins; Simple Isn’t Easy, 1995, Harper Paperbacks.
Kanarek, Lisa; Organizing Your Home Office for Success, 2nd ed. 1998, Blakely Press.
Morgenstern, Julie; Organizing from the Inside Out, 1998, Henry Holt and Co.
Schechter, Harriet; Conquering Chaos at Work: Strategies for Managing Disorganization and the People Who Cause It, 2000, Fireside/Simon & Schuster.
Schlenger, Sunny and Roberta Roesch; How to Be Organized in Spite of Yourself, 2nd ed. 1999, Signet.
Silver, Susan; Organized to Be Your Best!, 4th ed. 2000, Adams-Hall.

Recommendations from AD/HD document (not in above lists):

Eichermuller, P. (2002). (article) Taking control of your clutter. www.sunshineorganizing.com
Gracia, M. (2002). www.getorganizednow.com (forum)
Hall, J. (2002). www.overhall.com
Kolberg, J, Nadeau, K. (2002). ADD-friendly ways to organize your life. New York: Brunner-Routledge
Moulding, C. (2002). Ten ideas for quick clutter control. Get Organized Now Newsletter

Winston, S. (1995). Stephanie Winston’s best organizing tips. New York: Simon & Schuster

Organizations and Services

Association of Records Managers and Administrators – note the Web seminars on topics such as “Big Data – Strategic Asset or Corporate Risk & Burden?” and “Auto-classification: Common Myths Debunked”.
Information Requirements Clearinghouse – note their Retention Manager and Retention Wizard software products.
Harriet Schechter’s sites for The Miracle Worker and Let Go of Clutter.
Messies Anonymous – Sandra Felton’s site.

Related Sites

List Organizer – subscribe (!) for paper list templates.
Checklists site – a clutter of checklists.
Organize-it – devices to get more organized (no software).
FlyLady – housecleaning advice.

The Data Lifecycle page: meeting notes
The Clutter Buster Inventions page: data decluttering inventions
The Data Clutter page: notes from the Let Go of Clutter book, data declutter musings, and related resources
The Liberate Your Data page: data declutter abstracts and deliverable details

Emergent behavior

Is there a way to use emergent behavior to declutter?

http://www.thepangburns.com/jesse/projects/ant_simulation.php

http://www.amazon.com/Emergence-Connected-Brains-Cities-Software/dp/0684868768

http://books.google.ca/books?hl=fr&lr=&id=Au_tLkCwExQC&oi=fnd&pg=PA11&dq=EMERGENCE+The+connected+lives+of+ants,+brains,+cities,+and+software&ots=GkqxGiNVra&sig=dUZGpT0qQrtMegtcREIqvZSrWis#v=onepage&q=EMERGENCE%20The%20connected%20lives%20of%20ants%2C%20brains%2C%20cities%2C%20and%20software&f=false