CONCEPT Beta Release v3
README File
February 2005

CONCEPT Beta V3 contains the following changes:

1. Updates and corrections to the point and area models, especially in the
CAMx output structures
2. Met data processing and data input routines
3. Spatial processing for area and point sources
4. Spatial surrogate development

The processing procedures for the area and point models have not changed.
Processing requirements for the met data and spatial model are detailed in documents
included in this distribution:

Changes to CONCEPT to support Spatial Modeling.doc
CONCEPT Grid Definition Parameters.doc
Creating Area Source Spatial Surrogates.doc
Importing Met data into CONCEPT.doc

CONCEPT beta3 will require the installation of the following packages:

PROJ4
GEOS
PostGIS

The MM5 interface and MOBILE packages may also require a FORTRAN compiler.

1.1 Directory Structure

All CONCEPT code files are now in the concept subdirectory, and all data files for the
beta are in a separate subdirectory called concept_projects/beta3. This simplifies the
management of the source code under CVS (everything under the concept directory is in
source control, everything outside the concept directory is not).

The concept script has been moved to the concept directory (there is no longer a bin
directory). All executable code files other than the main concept script are now
in subdirectories under the src subdirectory. The main concept script and all called
scripts have been rewritten so that you can execute the concept script from any
directory. You no longer have to change directory to the concept location to run the
model. If you add the $CONCEPT_HOME directory to your PATH environment setting, you can
run CONCEPT from anywhere!

1.2 perl Must Be In Your Path

You will need the perl executable in your path. Although the perl installation recommends
a common location for the perl interpreter (/usr/bin/perl), some distributions do not
follow the guideline. Thus the shell scripts that run the perl programs call the perl
interpreter directly (rather than relying on the "shebang" notation - #!/usr/bin/perl).
If you are not sure whether perl is in your path, type the command "which perl" - if the
reply indicates "no perl in ..." then perl is not in your path. You can also try
"perl --version" and see if you get a "command not found" response. Please refer to your
shell documentation on how to add locations to your path environment variable.
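
The same check can be scripted. The sketch below is a minimal example that
relies only on the POSIX "command -v" builtin; the function name require_cmd is
illustrative and is not part of CONCEPT.

```shell
# Illustrative helper (not part of CONCEPT): succeed only if the named
# command can be found on the current PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "$1 not found in PATH" >&2
    return 1
}

# Example: stop a setup script early if perl is unavailable.
# require_cmd perl || exit 1
```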

1.3 Data Import System

The data import subsystem has been completely rewritten. The main import program is
now in perl (concept/src/import/import.pl). The fieldwidths.dat file has been renamed
field_defs.dat (it is in the concept/src/import directory), and it now defines the
field names, types (just C for character/date and N for numeric), and widths. The
import.pl program also processes the fields in the order given, so the file column
order no longer needs to match the database tables. In addition, all of the tables
have had their primary key constraints defined and defaults set for all NOT NULL fields
(except date fields, which have no good default value). The import.pl program only imports
the columns that are not empty in the input files - the remaining fields are not included
in the INSERT statement. This allows PostgreSQL to assign the proper default values to
missing or empty fields (e.g., country_code which is missing from a number of files and
which now defaults to 'US').
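
The effect of leaving empty columns out of the INSERT statement can be sketched
in shell as follows. This is only an illustration, not the logic used in
import.pl: the function build_insert, the table name example_table, and the
column names other than country_code are hypothetical, and the sketch quotes
every value as text and does no escaping.

```shell
# Hypothetical sketch: build an INSERT that names only the non-empty columns.
# Arguments: a table name followed by column=value pairs.
build_insert() {
    table=$1
    shift
    cols=""
    vals=""
    for pair in "$@"; do
        col=${pair%%=*}
        val=${pair#*=}
        # Skip empty values so the database default applies to that column.
        [ -n "$val" ] || continue
        cols="${cols:+$cols, }$col"
        vals="${vals:+$vals, }'$val'"
    done
    printf 'INSERT INTO %s (%s) VALUES (%s);\n' "$table" "$cols" "$vals"
}

# Example: country_code is empty, so it is left out of the statement.
# build_insert example_table state_code=21 country_code= pollutant=NOX
# prints: INSERT INTO example_table (state_code, pollutant) VALUES ('21', 'NOX');
```

Because country_code never appears in the generated statement, PostgreSQL fills
it with the column default rather than an empty string.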

The importer also takes a new parameter "-t" which instructs the program to use
transactions to batch the insert statements. The positive side of this is that it runs
much faster when you use the -t switch. The negative side is that when using transactions,
the entire current transaction (set to 10000 lines of input data) is aborted if any type of
error occurs. This renders the entire import job invalid. The recommended approach is to
first try to import your data with transactions, then if you encounter errors either fix
the input files or rerun the import without transactions. When you run without transactions,
only the line with the error is skipped - all other lines are imported. One common type of
error is a duplicate primary key error. This error can safely be ignored, so you can run
the entire import without transactions and just ignore the duplicate lines. If you wish to
speed up the import process, delete the duplicate lines from the input files and run the
import with transactions.
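
The recommended approach can be wrapped in a small shell function. This is a
sketch under one assumption that this README does not confirm: that the concept
command returns a non-zero exit status when an import fails. The function name
import_with_fallback is illustrative and not part of CONCEPT. Rows already
committed by earlier batches will show up as duplicate primary key errors on
the second pass, which can be ignored as described above.

```shell
# Illustrative wrapper (not part of CONCEPT): run an import command with
# "-t" first; if that fails, rerun the same command without transactions.
# Assumes the wrapped command exits non-zero on failure.
import_with_fallback() {
    if "$@" -t; then
        echo "import succeeded with transactions"
    else
        echo "transactional import failed; retrying without -t" >&2
        "$@"
    fi
}

# Example (paths as used elsewhere in this README):
# import_with_fallback concept import_nei -n beta3 -s scenario1 \
#     -d "$CONCEPT_PROJECTS/beta3/scenario1/nei"
```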

One last note on the importer - if there are errors in the input, examine the console output
from the import routine. It will contain the type of error and the line number in the
input file where the error occurred. At present, only the first 5 errors are reported -
you may change this by editing the value of the $maxErrors variable on line 67 of the
src/import/import.pl script.

1.4 Run Control File

The run control file content has been expanded somewhat. The file now includes values
for area and point source data QA levels, and a debug parameter. The QA levels control
how detailed the QA is for the NEI data (QA routines have been defined for levels 1 and 2).
The debug parameter at this point only controls whether the detailed tables for temporal
allocation are created. If the debug parameter is set to 2 or higher, each temporal
processor creates an additional table (area_temporal_debug or point_temporal_debug) that
contains the actual profiles, profile sources, and other detailed information for every
candidate emission record (nei_area_em or nei_point_em) for the model run. Creating
these tables adds to the runtime of each model, and the output is only required for
very detailed debugging.

1.5 24 Hours Per Record

In the previous release of the CONCEPT area and point source models, the output tables
were organized with one record per hour of the model run. This has been changed in this
release such that all output tables (starting with the final temporal allocation tables
and all tables created after that step) are created on a per-day basis with 24 hourly
emission values per record.
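
To picture the change in layout, the sketch below collapses per-hour rows into
per-day rows using awk. It does not reflect the actual CONCEPT table columns:
the four-field input layout (id, date, hour, value), the function name
hourly_to_daily, and the choice to fill missing hours with 0 are all
assumptions made for illustration.

```shell
# Illustrative only: collapse "id date hour value" rows (hour 0-23) into
# one line per id and date holding 24 hourly values.
hourly_to_daily() {
    awk '
        {
            key = $1 " " $2
            # Remember the order in which each id/date pair first appears.
            if (!(key in seen)) { seen[key] = 1; order[++n] = key }
            val[key, $3 + 0] = $4
        }
        END {
            for (i = 1; i <= n; i++) {
                line = order[i]
                for (h = 0; h < 24; h++) {
                    k = order[i] SUBSEP h
                    # Hours with no input row are written as 0 (assumption).
                    line = line " " ((k in val) ? val[k] : 0)
                }
                print line
            }
        }
    '
}

# Example: printf 'A 2005-02-01 0 1.5\nA 2005-02-01 23 2\n' | hourly_to_daily
```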

1.6 Numeric Scale and Precision

The NEI and RPO data tables were previously set up for unlimited numerical precision
using NUMERIC fields. This has been changed so that all number fields have their
scale and precision set to match the largest scale and precision allowed in the NEI
and RPO formats. All internally calculated values are stored as double precision FLOAT
fields.

1.7 Reference Data and NEI Inventory

The RPO and global reference files included with the second beta release are mostly
complete with the correct lookup data for the included NEI data. The NEI is for Kentucky,
and the spatial data is for the national 36k grid.


2.0 Running the CONCEPT Beta

Here are the steps for running the CONCEPT area source and point source model betas:

2.1 Make sure that your user id has the following environment variables set correctly:

PGHOME - should point to the install directory for PostgreSQL
PGDATA - should point to the directory where the PostgreSQL data files reside
CONCEPT_HOME - should point to the concept subdirectory in the directory where you
unpacked the CONCEPT beta
CONCEPT_PROJECTS - should point to the concept_projects subdirectory in the directory
where you unpacked the CONCEPT beta
PATH - should contain the $CONCEPT_HOME directory and the PostgreSQL bin
directory. Should also contain the location of the perl executable.
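
As a sketch, the settings might look like the following in a Bourne-style shell
profile. Every path here is an example only: the PostgreSQL paths follow the
ones used in step 2.3, and the $HOME locations for CONCEPT are assumptions
about where the beta was unpacked. The function check_concept_env is
illustrative and not part of CONCEPT.

```shell
# Example values only; adjust every path to match your installation.
export PGHOME=/usr/local/pgsql
export PGDATA=/usr/local/pgsql/data
export CONCEPT_HOME=$HOME/concept
export CONCEPT_PROJECTS=$HOME/concept_projects
export PATH=$CONCEPT_HOME:$PGHOME/bin:$PATH

# Illustrative check (not part of CONCEPT): report any unset variable and
# return non-zero if at least one is missing.
check_concept_env() {
    missing=0
    for name in PGHOME PGDATA CONCEPT_HOME CONCEPT_PROJECTS; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        fi
    done
    return $missing
}
```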

2.2 If you have not initialized the PostgreSQL database system post-install (e.g., if
PostgreSQL came pre-installed on your system), you need to do so as follows:

su postgres -c "initdb -D $PGDATA"

NOTE - the syntax "su postgres -c ..." is used repeatedly in this README. If you
are unfamiliar with the command, please refer to its man page for more information.
You can accomplish all of these commands directly as the postgres user if you prefer.

2.3 If necessary, start the PostgreSQL database system:

su - postgres
/usr/local/pgsql/bin/postmaster -D /usr/local/pgsql/data >logfile 2>&1 &
exit

This command will place the PostgreSQL log file ("logfile") in the directory from
which you ran the command. If you prefer a different location, or want to use a
different log file name, replace logfile with the appropriate file specification.
Please refer to the PostgreSQL documentation for additional options.

2.4 If necessary, create the concept group in PostgreSQL:

su postgres -c "psql -c 'create group concept' test"

This only needs to be executed once regardless of the number of projects you create.
NOTE - the parameter "test" refers to the initial test database that you set up
during the PostgreSQL install. If you used a different database name, you can
substitute the name of any valid PostgreSQL database instead.

2.5 If you will be running CONCEPT as a user other than the PostgreSQL owner, add your
user id to the concept group in PostgreSQL (this is not a unix group):

su postgres -c "createuser john"
su postgres -c "concept add_user -n test -u john"

This only needs to be run once per user, regardless of which project you are working
on. Replace "john" with your unix user id. When you run the first command, answer
yes to the permissions questions that arise.

2.6 Create a CONCEPT project for the beta test. This will create a database called "beta3":

su postgres -c "concept create_project -n beta3"

This assumes you are not logged in as the postgres user, and that the database is running.
If you have previously created the beta3 database, either use a different name for
the project, or drop the original project using the following command:

su postgres -c "concept drop_project -n beta3"

2.7 Initialize the beta3 project:

concept init_project -n beta3

This command creates the project database, and creates the global lookup tables and
stored procedures.

2.8 Each CONCEPT run is executed for a specific scenario. Create a scenario for the
beta test:

concept add_scenario -n beta3 -s scenario1

This command adds the scenario schema and creates the scenario-specific tables.

2.9 Import the global reference data:

concept import_globals -n beta3 -d $CONCEPT_PROJECTS/beta3/globals -t

The global reference data provided for the beta3 is "clean" so you can use transactions
(with the "-t" switch) to speed up the import.

2.10 Import the RPO cross-reference data (again, with transactions):

concept import_rpo -n beta3 -d $CONCEPT_PROJECTS/beta3/rpo -t

2.11 Import the control file for the scenario:

concept import_control -n beta3 -s scenario1 \
-c $CONCEPT_PROJECTS/beta3/scenario1/run_control.txt

2.12 Import the scenario-specific inventory data:

concept import_nei -n beta3 -s scenario1 -d $CONCEPT_PROJECTS/beta3/scenario1/nei

If you prefer, you can import the area and point source data independently:

concept import_nei_area -n beta3 -s scenario1 -d $CONCEPT_PROJECTS/beta3/scenario1/nei
concept import_nei_point -n beta3 -s scenario1 -d $CONCEPT_PROJECTS/beta3/scenario1/nei

2.13 Run the QA routines:

concept qa_nei_area -n beta3 -s scenario1
concept qa_nei_point -n beta3 -s scenario1

These commands run the QA routines on the NEI input data, set some default values in the NEI
and RPO tables, and prepare the integer keys for the NEI data.
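
Steps 2.7 through 2.13 can be chained in a small driver so that a failure stops
the sequence. The sketch below is not one of the scripts shipped with the beta;
the function name run_beta3_setup is illustrative, and it assumes the concept
command returns a non-zero exit status on failure, which this README does not
state.

```shell
# Illustrative driver (not part of CONCEPT): run steps 2.7 through 2.13 in
# order and stop at the first command that fails.
# Assumes CONCEPT_PROJECTS is set as described in step 2.1.
run_beta3_setup() {
    project=beta3
    scenario=scenario1
    data=$CONCEPT_PROJECTS/$project

    concept init_project -n "$project" &&
    concept add_scenario -n "$project" -s "$scenario" &&
    concept import_globals -n "$project" -d "$data/globals" -t &&
    concept import_rpo -n "$project" -d "$data/rpo" -t &&
    concept import_control -n "$project" -s "$scenario" \
        -c "$data/$scenario/run_control.txt" &&
    concept import_nei -n "$project" -s "$scenario" -d "$data/$scenario/nei" &&
    concept qa_nei_area -n "$project" -s "$scenario" &&
    concept qa_nei_point -n "$project" -s "$scenario"
}

# Example: run_beta3_setup > log.setup 2>&1
```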

2.14 Run the area model:

concept run_area_model -n beta3 -s scenario1 -d output_directory

This command executes the three area source modules and generates the CAMx
output files and the other outputs. The output files are written to the
directory specified by the -d parameter.

2.15 Run the point model:

concept run_point_model -n beta3 -s scenario1 -d output_directory

This command executes the three point source modules and generates the CAMx
output files and the other outputs. The output files are written to the
directory specified by the -d parameter.

2.16 Optional - a utility procedure is provided that counts the records in all of the
intermediate and final output tables. You can run it from the psql program:

psql beta3
beta3=> set search_path=scenario1,xref,globals;
SET
beta3=> select table_counts();

Once this procedure is complete, the table "table_counts" contains raw counts and
some summarized counts by table. You can create a file with this information at
the shell prompt:

psql beta3 -c "select * from scenario1.table_counts" > table_counts.txt

NOTE - you must run both the area and point source models before running the
table_counts procedure.


3.0 Optional Utility Scripts

In the $CONCEPT_PROJECTS/beta3 directory you will find four utility scripts for
running the CONCEPT betas. Make sure your environment is set up according
to the requirements listed in section 2.1. The scripts are as follows:


init_beta3.sh deletes and adds the "scenario1" scenario, initializes
the scenario, runs the data imports, and runs the qa
routines - the execution is logged to the file
log.init_beta3.

The file example_beta3_command shows example syntax for all of the current concept commands.

4.0 Administration Notes

The CONCEPT model creates and deletes many tables during execution. If the model
performance begins to deteriorate, you can run the vacuumdb command to clean up
unused space. The command must be run as the postgres user:

vacuumdb beta3

If vacuuming no longer improves performance to the degree desired, drop the
CONCEPT project and recreate it:

su postgres -c "concept drop_project -n beta3"
su postgres -c "concept create_project -n beta3"