Collecting Data is the Most Abstract Thing You Can Do.

I borrowed that statement from Tim Allen, Professor Emeritus at the University of Wisconsin.  Tim is a colleague of long standing, and I think he’s one of the best thinkers in ecology.  His books on hierarchy and complexity repay detailed attention.  So why does he say that data collection is so abstract, and what does this have to do with urban long-term ecological research?
At first glance, collecting data seems like a pretty concrete activity, like going to work at the bank, or cutting the grass.  To collect data, you pull on your boots, spray tick repellent on your pants cuffs, and head out the door.  Or you fire up the computer and download the latest census data or a remotely sensed image from the National Agriculture Imagery Program.
But the concreteness is only an illusion.  Of course, there is the obvious abstraction of a statistical design.  The spatial and temporal arrangement of samples and the statistical models that will be used to detect any difference among samples are pretty well in mind as you pull on those boots or, following our footwear metaphor, boot up the computer.
Photo: Erica Tauzer preparing to sample a vacant lot in Baltimore.
Lurking behind the statistical abstraction implied by data are deeper theoretical and conceptual structures.  Data are as much conceptual as they are empirical.  What kinds of questions does girding for battle with data raise about the conceptual realm?  This is an important area of consideration during BES’s Year of Theory.
Data are, first of all, framed.  They are collected within a specified spatial extent, and represent a specified spatial grain size or temporal window.  The spatial or temporal interval between samples is also a kind of framing.  It is no mistake that the term “framework” is so important in discussions of theory.  That term acknowledges that framing, and specifying the relationships of data in the frame, are key tasks for theory.  In other words, framing specifies the scope and spatial and temporal texture of the area of interest.  The framing tells what the data are “for” or “against.”
Data collected are relevant to some model, and that model needs to be specified.  Models indicate the entities or processes of concern, how they are related, what the expected dynamics are, and what the potential outcomes are.  Models thus fill in the details of the working of the system within its specified frame.  Often, multiple models are employed to understand a system, as models work best when they have very specific scopes.  Consequently, complementary models that cover different scopes of the pattern or process of interest must be employed.
Data usually rely on some theoretical structure to determine what measurements are appropriate.  Measurement of temperature as a scientific variable would be useless without the theory of heat to explain what processes temperature can affect.  Further biological models, like that of Q10, expressing the relationship of endothermic versus ectothermic metabolism to external temperature, add richness to the role of temperature data.
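To make the Q10 idea concrete, here is a minimal sketch in Python.  It assumes the standard definition of Q10 as the factor by which a rate changes for each 10 degree Celsius rise in temperature.  The function names and the numbers in the usage lines are hypothetical illustrations, not BES measurements.

```python
def q10(rate1, rate2, temp1_c, temp2_c):
    """Estimate Q10 from two rates measured at two temperatures (deg C).

    Q10 = (rate2 / rate1) ** (10 / (temp2_c - temp1_c))
    """
    if temp1_c == temp2_c:
        raise ValueError("the two temperatures must differ")
    if rate1 <= 0 or rate2 <= 0:
        raise ValueError("rates must be positive")
    return (rate2 / rate1) ** (10.0 / (temp2_c - temp1_c))


def rate_at(rate_ref, temp_ref_c, temp_c, q10_value):
    """Project a rate to another temperature using a given Q10."""
    return rate_ref * q10_value ** ((temp_c - temp_ref_c) / 10.0)


# Hypothetical example: a metabolic rate that doubles between 10 and 20 deg C.
coefficient = q10(1.0, 2.0, 10.0, 20.0)
projected = rate_at(1.0, 10.0, 30.0, coefficient)
print(coefficient, projected)
```

The point of the sketch is that a temperature reading only becomes biologically meaningful once a model like this says what the reading is expected to change.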
Theories are also key to comparison.  For example, in Baltimore, bird biodiversity has been found to relate to vegetation in neighborhoods and nearby parks, while in Phoenix, bird diversity has been found to relate to the wealth of neighborhood residents.  At first these seem to imply contradictory theories.  However, underlying both relationships is the response of bird communities to vegetation structure.  It turns out that the social and historical drivers of the bird-vegetation relationship differ between the two cities.  A deeper theoretical structure is implied by the initial incongruity between the data of the two cities.
Another example emerges from the watershed approach used in BES.  Why do we measure the things we do in streams?  The watershed approach frames material fluxes as integrated by water within the boundaries of a catchment.  In BES, as in any urban system, piped water input and the rerouting of water within the watershed in drains and storm sewers are model details that are required.  So the concreteness of data collection assumes the existence of infrastructure within the watershed.  Of course, it also assumes a patch structure that may influence the processing of materials – their transformation or transport – in the watershed.  Finally, the relevant theory suggests that limiting nutrients will be retained by the biological processes in the watershed, while those that are not limiting will be passed through at levels reflecting their input and the flow resistance within the watershed.  The chemical forms, sizes of particles, and role in organismal metabolism are all details that determine how a material will behave.  In ecosystems outside of urban areas, these last ideas may be combined in the principle of ecosystem retention.  This emerging theory of urban watershed function explains the different behaviors of materials viewed as contaminants, pollutants by virtue of their excessive concentration, and indicators of human activity.
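The retention idea can be illustrated with a simple input-output budget.  The sketch below is one common bookkeeping convention, not a BES protocol: retention is taken as the fraction of a material's total input that does not leave the catchment in streamflow.  The function name and the flux values are hypothetical, and in an urban watershed the input term would have to include piped water and other infrastructure sources as well as atmospheric deposition.

```python
def retention_fraction(input_flux, output_flux):
    """Fraction of a material input retained within a watershed.

    Both fluxes must be in the same units (for example kg per hectare per year).
    A negative result means the watershed exported more than it received.
    """
    if input_flux <= 0:
        raise ValueError("input_flux must be positive")
    return (input_flux - output_flux) / input_flux


# Hypothetical fluxes: 10.0 units in, 2.5 units out in streamflow.
print(retention_fraction(10.0, 2.5))
```

Under the theory described above, a limiting nutrient would be expected to show a retention fraction near one, while a non-limiting material would show a value closer to zero.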
The frameworks of scientific theories are often depicted as nested hierarchies.  A theory in its most general form must contain more specific subtheories or models to translate its abstractions into measurable quantities.  Likewise, even those translating theories and models may need to be further specified for very particular times and places.  Hence, the theories in between the most general and the most specific are of great importance.  They are called “midlevel theories” and are the locus of much interdisciplinary and integrative work.  The models of greatest detail may not translate well across disciplines, while the theories at their most general may offer only metaphorical encouragement for integration across disciplines.  Being attuned to different levels of abstraction is important in managing and linking different kinds of data.
Metacity theory provides an example of nested theories in urban ecology.  The most general level of the metacity states the phenomenon of interest: spatially heterogeneous and changing mosaics of urban systems.  This calls for three more specific kinds of theory: those that deal with the landscape mosaics in which fluxes can be modeled, those that deal with the choices that people, institutions, and organisms make about where to locate or move in the urban complex, and finally those that portray the combined outcomes of fluxes and choices.  Each of these three midlevel theories would be supported by still more specific models.  For example, the flux mosaic might include models of human migration, biogeochemical nutrient flows, energy apportionment, and traffic.  Each of the other midlevel theories could similarly be subdivided into more specific models.

Each set of data, and each relationship between one kind of data and another, calls for a statement of the framing assumptions, the model structures, and the more inclusive or general theoretical relationships.  Sorting this out, and articulating these relationships for all our supposedly concrete data sets, is a task for the BES Year of Theory.
Allen, T. F. H. and T. W. Hoekstra. 1992. Toward a unified ecology. Columbia University Press, New York.
Ahl, V. and T. F. H. Allen. 1996. Hierarchy theory: a vision, vocabulary, and epistemology. Columbia University Press, New York.
Pickett, S. T. A., J. Kolasa, and C. Jones. 2007. Ecological understanding: the nature of theory and the theory of nature. 2nd edition. Academic Press, Boston.