Exporting NX Data To Excel
■ Introduction
There are many situations where it is useful to exchange information (in either direction) between NX and Excel.
For example, you might want to export an NX bill-of-material or other report to Excel. Alternatively, you might want
to import data from an Excel spreadsheet and use this to assign values to attributes in NX. There are two somewhat
separate steps in the data exchange process: getting data into and out of NX, and getting data into and out of Excel.
These two steps are both discussed in this chapter, even though the second one is not really related to NX.
There are (at least) three different ways to programmatically “push” data to Excel:
The Automation Approach: use the Excel API to write data into an Excel document.
The CSV Approach: write a csv file that can then be imported into Excel.
The XML Approach: write an XML file that can then be imported into Excel.
These are discussed in the sections below. However, note that the first and second of these are somewhat related,
so you should read about the Automation Approach before reading about the CSV Approach.
Excel Macros
The functions in the Excel API (whether wrapped or not) correspond very closely with the functions of interactive
Excel. You can understand this correspondence by recording macros in Excel and examining their contents. For
example, if you record the actions of typing “xyz” into cell C3, and making it bold, your macro will contain the
following code:
Range("C3").Select
ActiveCell.FormulaR1C1 = "xyz"
Selection.Font.Bold = True
This code uses the VBA language (Visual Basic for Applications), but translating it into VB.NET or other languages
is typically straightforward. This is the same process that we use to discover functions in the NX/Open API, though
NX has the added advantage that it can record macro code in several languages, not just in VBA.
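As an illustration of how direct the translation usually is, the recorded macro above might be written in VB.NET roughly as follows. This is a sketch rather than recorder output: it assumes a reference to the Microsoft.Office.Interop.Excel assembly, the module and variable names are arbitrary, and the COM cleanup discussed later in this chapter is omitted for brevity.

```vbnet
Imports Excel = Microsoft.Office.Interop.Excel

Module MacroTranslation
    Sub Main()
        ' Start Excel and create an empty workbook to work in
        Dim app As New Excel.Application()
        app.Visible = True
        Dim workBook As Excel.Workbook = app.Workbooks.Add()
        Dim workSheet As Excel.Worksheet = CType(workBook.ActiveSheet, Excel.Worksheet)

        ' The macro's Select/ActiveCell/Selection steps become one direct cell reference
        Dim cell As Excel.Range = workSheet.Range("C3")
        cell.Value = "xyz"
        cell.Font.Bold = True
    End Sub
End Module
```

Working with a Range variable instead of the current selection is a design choice: it does not depend on which cell happens to be active when the code runs.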
Example
This example shows you how you can write data into an Excel spreadsheet and format the cells. We use a fictitious
set of part records to illustrate the techniques. Each part has an 18-digit part number, a weight, a cost, and a
purchase date. An array of part records is returned from a function called GetParts, which you have to provide. In
an NX scenario, the GetParts function would probably read data from NX attributes.
Getting Started with SNAP Chapter xx: Sharing Data with MS Office Page 1
To use the code shown below, you will need to have a reference to Microsoft.Office.Interop.Excel. As its name
suggests, this is the interoperability assembly for Excel, and you can find it on the .NET tab of the Add Reference
dialog in Visual Studio, as shown below:
On the COM tab, you can find another Excel library, called Microsoft Excel 14.0 Object Library. This will work, too,
but the one on the .NET tab is preferable. Then, the code is as follows:
End Sub
End Class
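The overall pattern of such a program can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the PartRecord structure, the single sample record returned by GetParts, and the column layout are all invented for the sketch, and the COM cleanup step is left out.

```vbnet
Imports Excel = Microsoft.Office.Interop.Excel

Public Class AutomateExcelSketch

    ' Hypothetical record type, assumed for this sketch
    Public Structure PartRecord
        Public PartNumber As String      ' 18 digits, so kept as a string
        Public Weight As Double
        Public Cost As Double
        Public PurchaseDate As Date
    End Structure

    ' Stand-in for the GetParts function that you have to provide
    Public Shared Function GetParts() As PartRecord()
        Dim part As New PartRecord()
        part.PartNumber = "123456123456123456"
        part.Weight = 14.75
        part.Cost = 1995
        part.PurchaseDate = New Date(2012, 2, 3)
        Return New PartRecord() {part}
    End Function

    Public Shared Sub Main()
        Dim app As New Excel.Application()
        app.Visible = True
        Dim workBook As Excel.Workbook = app.Workbooks.Add()
        Dim workSheet As Excel.Worksheet = CType(workBook.ActiveSheet, Excel.Worksheet)

        ' Format column #1 as text before writing, so long part numbers keep all digits
        Dim cells As Excel.Range = CType(workSheet.Columns(1), Excel.Range)
        cells.NumberFormat = "@"
        cells.ColumnWidth = 22

        ' Write one row per part; Excel rows and columns are numbered from 1
        Dim parts As PartRecord() = GetParts()
        For i As Integer = 0 To parts.Length - 1
            Dim row As Integer = i + 1
            workSheet.Cells(row, 1) = parts(i).PartNumber
            workSheet.Cells(row, 2) = parts(i).Weight
            workSheet.Cells(row, 3) = parts(i).Cost
            workSheet.Cells(row, 4) = parts(i).PurchaseDate
        Next

        ' Number formats for the remaining columns
        CType(workSheet.Columns(2), Excel.Range).NumberFormat = "0.00"
        CType(workSheet.Columns(3), Excel.Range).NumberFormat = "$#,##0"
        CType(workSheet.Columns(4), Excel.Range).NumberFormat = "dd-mmm-yyyy"
    End Sub
End Class
```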
Most of the code is straightforward, and should be easy to understand. One thing to note is that cell numbering in
Excel starts at 1, not at 0. So, cell “C2” is Cells(2,3), for example.
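The relationship between 1-based (row, column) indices and cell names can be made concrete with a small helper. The function below is hypothetical (it is not part of the Excel API) and simply shows the numbering convention.

```vbnet
Imports System

Public Module CellAddressing
    ' Converts 1-based (row, column) indices into an A1-style name
    Public Function ToA1(ByVal row As Integer, ByVal column As Integer) As String
        If row < 1 OrElse column < 1 Then
            Throw New ArgumentOutOfRangeException("row", "Excel row and column indices start at 1")
        End If
        Dim letters As String = ""
        Dim n As Integer = column
        While n > 0
            ' Columns use letters A..Z, then AA, AB, and so on
            Dim remainder As Integer = (n - 1) Mod 26
            letters = ChrW(65 + remainder) & letters
            n = (n - 1) \ 26
        End While
        Return letters & row.ToString()
    End Function
End Module
```

For example, ToA1(2, 3) returns "C2", matching Cells(2,3), and ToA1(1, 27) returns "AA1".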
The only real mystery is the last step, where we “clean up COM objects”. This step is necessary because we are
using (indirectly) a COM API. When our VB code defines objects like the Excel application, the workbook, and the
worksheet, hidden COM objects are created, and the normal .NET garbage collection is unable to handle these. So,
when we are finished with these objects, we have to take care of destroying them and free-ing their memory. Some
code to do this cleanup is shown below. Don’t worry if you don’t understand this code; many experienced
programmers don’t understand it, either. Just place this code inside your AutomateExcel class, accept that it’s
necessary, and try not to worry about it too much.
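One common way to write such cleanup code is sketched here. It is an assumption about how the step might be done, not necessarily the exact listing intended for the AutomateExcel class: the class and method names are invented, and only the standard Marshal and GC calls from the .NET Framework are used.

```vbnet
Imports System
Imports System.Runtime.InteropServices

Public Class ComCleanupSketch
    ' Releases one COM wrapper; Nothing and ordinary .NET objects are ignored
    Public Shared Sub Release(ByVal comObject As Object)
        If comObject IsNot Nothing AndAlso Marshal.IsComObject(comObject) Then
            Marshal.FinalReleaseComObject(comObject)
        End If
    End Sub

    ' Releases the objects in the order given, then forces a garbage collection
    Public Shared Sub Cleanup(ByVal ParamArray comObjects As Object())
        If comObjects Is Nothing Then Return
        For Each comObject As Object In comObjects
            Release(comObject)
        Next
        GC.Collect()
        GC.WaitForPendingFinalizers()
    End Sub
End Class
```

A call might look like ComCleanupSketch.Cleanup(cells, workSheet, workBook, app), releasing the innermost objects first and the application last.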
To see why the text format matters, remove the following two lines from the code
cells = workSheet.Columns(1)
cells.NumberFormat = "@" ' Format column #1 as text
and run the code again. The result will be this spreadsheet:
Since we are no longer providing any help, Excel tries to make a guess about the contents of column #1, and it
guesses that they should be numbers, and stores them internally as numbers. But, Excel numbers only have around
15 digits of precision, so, as the display in the formula bar shows, the last 3 digits of each part number have been
lost. Clearly, text formatting (and text storage) is needed, here.
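The 15-digit limit can be turned into a simple check that decides which values need text formatting. The helpers below are hypothetical and assume the input is a string of digits with no sign, decimal point, or leading zeros; AsStoredByExcel only mimics the effect described above, in which digits beyond the 15th are replaced by zeros.

```vbnet
Imports System

Public Module PartNumberChecks
    Public Const ExcelSignificantDigits As Integer = 15

    ' True if a digits-only string would lose digits when Excel stores it as a number
    Public Function NeedsTextFormat(ByVal value As String) As Boolean
        Dim significant As String = value.Trim().TrimStart("0"c).TrimEnd("0"c)
        Return significant.Length > ExcelSignificantDigits
    End Function

    ' Mimics the effect on an over-long integer: digits after the 15th become zeros
    Public Function AsStoredByExcel(ByVal digits As String) As String
        If digits.Length <= ExcelSignificantDigits Then Return digits
        Return digits.Substring(0, ExcelSignificantDigits) &
               New String("0"c, digits.Length - ExcelSignificantDigits)
    End Function
End Module
```

For the first sample part number, NeedsTextFormat("123456123456123456") is True, and AsStoredByExcel returns "123456123456123000".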
The main disadvantage of CSV files is that they contain no formatting information. However, some simple formatting options
can be specified when importing the data into Excel using the Text Import Wizard:
Roughly these same formatting controls are available if the file is imported into Excel programmatically by calling the
OpenText function. This function has the following arguments:
void OpenText(
string Path,
[object Origin = System.Type.Missing],
[object StartRow = System.Type.Missing],
[object DataType = System.Type.Missing],
[Excel.XlTextQualifier TextQualifier = Excel.XlTextQualifier.xlTextQualifierDoubleQuote],
[object ConsecutiveDelimiter = System.Type.Missing],
[object Tab = System.Type.Missing],
[object Semicolon = System.Type.Missing],
[object Comma = System.Type.Missing],
[object Space = System.Type.Missing],
[object Other = System.Type.Missing],
[object OtherChar = System.Type.Missing],
[object FieldInfo = System.Type.Missing],
[object TextVisualLayout = System.Type.Missing],
[object DecimalSeparator = System.Type.Missing],
[object ThousandsSeparator = System.Type.Missing],
[object TrailingMinusNumbers = System.Type.Missing],
[object Local = System.Type.Missing])
The square brackets indicate optional arguments, as usual. The meanings of the arguments are as follows:
Argument              Type                      Description
Semicolon             Object (Boolean)          True to have the semicolon character be a field delimiter.
                                                The default value is False.
Comma                 Object (Boolean)          True to have the comma character be a field delimiter.
                                                The default value is False.
Space                 Object (Boolean)          True to have the space character be a field delimiter.
                                                The default value is False.
Other                 Object (Boolean)          True to have the character specified by the OtherChar
                                                argument be a field delimiter. The default value is False.
OtherChar             Object (String)           Required if Other is True. Specifies the delimiter
                                                character when Other is True. If more than one character
                                                is specified, only the first character of the string is used.
FieldInfo             Object(2, n)              A two-dimensional array containing parse information for
                                                individual columns of data. See below for further details.
TextVisualLayout      XlTextVisualLayoutType    The visual layout (direction) of the text. The default is the
                                                system setting (I think), which will usually be
                                                xlTextVisualLTR (left-to-right), unless you are using a
                                                language like Hebrew.
DecimalSeparator      Object (String)           The decimal separator that Microsoft Excel uses when
                                                recognizing numbers. The default setting is the system
                                                setting.
ThousandsSeparator    Object (String)           The thousands separator that Excel uses when recognizing
                                                numbers. The default setting is the system setting.
TrailingMinusNumbers  Object (Boolean)          Specify True if numbers with a minus character at the end
                                                should be treated as negative numbers. If False or omitted,
                                                numbers with a minus character at the end are treated as
                                                text.
Local                 Object (Boolean)          Specify True if regional settings of the machine should be
                                                used for separators, numbers and data formatting.
Some of the more puzzling parameters are described in detail in the paragraphs below.
Origin
This can be one of the following XlPlatform constants: xlMacintosh, xlWindows, or xlMSDOS. Alternatively, this
could be an integer indicating the number of the desired code page. The allowable integer values are shown in the
“File origin” menu in the Text Import Wizard:
If this argument is omitted, the method uses the current setting from the Text Import Wizard.
TextQualifier
This is a character that can be used to enclose a sequence of characters, thereby forcing them to become one cell,
even if they include a delimiter character. For example, suppose that commas are being used as delimiters. Then the
string 1,260 would be split into two cells, even though the intention is probably to create a single cell containing
the number 1260. Similarly, we would probably want to force the string “Monday, July 4th” to be a single cell. Of
course, there is no need for a TextQualifier if you choose delimiter characters that don’t appear within the data
itself.
So, to make three cells from the three numbers 1,260 1,261 1,262, there are two possible approaches:
(1) If you have control over how the file is generated, create it with semicolons or some other characters as
delimiters (not commas), like this: 1,260; 1,261; 1,262. You can then use
TextQualifier = Excel.XlTextQualifier.xlTextQualifierNone. This is usually the easiest approach.
(2) If you’re forced to use commas as delimiters, then you must use a TextQualifier to properly group the data. For
example, you might use TextQualifier = Excel.XlTextQualifier.xlTextQualifierDoubleQuote, and write
your input data as "1,260","1,261","1,262".
The three allowable values of XlTextQualifier are:
Excel.XlTextQualifier.xlTextQualifierNone
Excel.XlTextQualifier.xlTextQualifierSingleQuote
Excel.XlTextQualifier.xlTextQualifierDoubleQuote
There is widespread confusion about the TextQualifier argument, possibly as a result of its poorly chosen name.
Many people think that using this argument will force Excel to format the enclosed strings as text (rather than as
numbers). This is not correct. To force Excel to format data as text, you must use the “FieldInfo” parameter.
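When you generate the file yourself and must use commas as delimiters, the quoting in approach (2) can be automated. The helper below is a hypothetical sketch: it wraps a field in double quotes only when needed and doubles any embedded quotes, which is the usual CSV convention and matches the xlTextQualifierDoubleQuote setting.

```vbnet
Imports System

Public Module DelimitedText
    ' Wraps a field in double quotes when it contains the delimiter or a quote,
    ' doubling any embedded quotes
    Public Function QuoteField(ByVal field As String, ByVal delimiter As Char) As String
        Dim quote As String = """"
        If field.IndexOf(delimiter) < 0 AndAlso Not field.Contains(quote) Then
            Return field
        End If
        Return quote & field.Replace(quote, quote & quote) & quote
    End Function

    ' Builds one line of the file from its fields
    Public Function JoinFields(ByVal fields As String(), ByVal delimiter As Char) As String
        Dim quoted(fields.Length - 1) As String
        For i As Integer = 0 To fields.Length - 1
            quoted(i) = QuoteField(fields(i), delimiter)
        Next
        Return String.Join(delimiter.ToString(), quoted)
    End Function
End Module
```

Calling JoinFields with the fields 1,260 and 1,261 and 1,262 and a comma delimiter produces the line "1,260","1,261","1,262", while fields that contain no comma are written unquoted.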
FieldInfo
This is a two-dimensional array indicating how various columns of data should be parsed and formatted during
import. It is easiest to think of it as a list of pairs of the form (columnNumber, dataType), where columnNumber
indicates which column is under consideration, and dataType is one of the enumerated values from
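Excel.XlColumnDataType. The most interesting values of this enum are xlGeneralFormat, xlTextFormat, xlSkipColumn,
and the date formats such as xlMDYFormat and xlDMYFormat.
For example, a FieldInfo array containing the pairs (1, xlTextFormat), (4, xlTextFormat), and (3, xlMDYFormat or
another date format) says that
columns #1 and #4 (the “A” column and the “D” column) should be formatted as text,
column #3 (the “C” column) should be formatted as dates,
all other columns should be parsed and formatted as “general”.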
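A FieldInfo array matching this description might be declared as follows. This is a sketch: the module and function names are invented, the choice of xlMDYFormat for the date column is an assumption, and a reference to Microsoft.Office.Interop.Excel is required.

```vbnet
Imports Excel = Microsoft.Office.Interop.Excel

Public Module FieldInfoSketch
    ' Each row is a (columnNumber, dataType) pair; columns not listed are parsed as general
    Public Function MakeFormat() As Object(,)
        Return New Object(,) {
            {1, Excel.XlColumnDataType.xlTextFormat},
            {4, Excel.XlColumnDataType.xlTextFormat},
            {3, Excel.XlColumnDataType.xlMDYFormat}}
    End Function
End Module
```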
The order of the pairs doesn’t matter; listing the same pairs in a different order gives the same result. If
there's no column specifier for a particular column in the input data, the
column is parsed with the General setting, which means that Excel will try to guess the correct format. If the
column contains strings that Excel can recognize as dates, for example, then this column will be formatted as dates
even though you specified a “general” format or no format at all.
If you specify that a column is to be skipped, you must explicitly state the type for all the other columns, or the data
will not parse correctly.
The xlDMYFormat date format seems to have some bugs, but the xlMDYFormat one works fine. Having spaces at the
beginning of a date field will confuse the parsing, just as it does when typing into Excel.
In its full glory, a call would look something like this:
app.Workbooks.OpenText(
pathName, origin, startRow,
dataType, textQualifier, consecutiveDelimiter,
useTab, useSemicolon, useComma, useSpace, useOther, otherChar,
myFormat,
textVisualLayout,
decimalSeparator, thousandsSeparator, trailingMinusNumbers, local)
But we can take advantage of Visual Basic’s ability to omit optional arguments, and write this, instead:
app.Workbooks.OpenText(
pathName,
Semicolon := True,
DataType := Excel.XlTextParsingType.xlDelimited,
FieldInfo := myFormat)
The “:=” syntax is used to give values to optional named arguments. Some people use sequences of commas when
omitting arguments, or they use System.Type.Missing as a placeholder, but the approach shown above seems
easier to read and less error-prone.
A Simple Example
Suppose we have the following simple text file, containing part records:
123456123456123456; 14.75; 1,995 ;2/3/2012
234567234567234567; 2.75; 675 ;6/11/2012
345678345678345678; 0.25; 69 ;12/17/2011
The fields represent the part number (18 digits), the weight, the cost in U.S. dollars, and the purchase date. As you
can see, semicolons are used as delimiters. Using commas would complicate things since commas also appear
within the cost field.
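A file in this format could be produced with ordinary .NET file functions, with no need for Excel. The sketch below is hypothetical: the PartRecord structure and the class name are assumptions, and the fields are written without the padding spaces shown in the sample, since leading spaces can confuse date parsing. The invariant culture is used so that the output does not depend on the regional settings of the machine.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO

Public Class PartFileWriter
    ' Hypothetical record type, assumed for this sketch
    Public Structure PartRecord
        Public PartNumber As String
        Public Weight As Double
        Public Cost As Double
        Public PurchaseDate As Date
    End Structure

    ' Formats one record as partNumber;weight;cost;date
    Public Shared Function FormatLine(ByVal part As PartRecord) As String
        Dim inv As CultureInfo = CultureInfo.InvariantCulture
        Return String.Join(";", New String() {
            part.PartNumber,
            part.Weight.ToString("0.00", inv),
            part.Cost.ToString("#,##0", inv),
            part.PurchaseDate.ToString("M/d/yyyy", inv)})
    End Function

    ' Writes one line per record to the given file
    Public Shared Sub WriteFile(ByVal fileName As String, ByVal parts As PartRecord())
        Using writer As New StreamWriter(fileName)
            For Each part As PartRecord In parts
                writer.WriteLine(FormatLine(part))
            Next
        End Using
    End Sub
End Class
```

For the first sample record, FormatLine returns 123456123456123456;14.75;1,995;2/3/2012.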
The code to import this file would be as follows. Again, note that you’ll need a reference to the
Microsoft.Office.Interop.Excel assembly to use this code.
Class OpenTextExample
' Import the data file, which will add a new item to the Workbooks collection
Dim workBooks As Excel.Workbooks = app.Workbooks
workBooks.OpenText(path, DataType:=myType, Semicolon:=True, FieldInfo:= myFormat)
End Sub
End Class
Specifying the DataType as xlDelimited is necessary, or else Excel will interpret the file as having fixed-width
fields. As you can see, we have asked for the first column to be parsed and formatted as text. Without this request,
the part numbers would be interpreted and stored as numbers, which would cause problems. Also, we need to
specify the “general” format for the second column, or else Excel will mysteriously interpret the 2.75 in cell B2 as a
date (February 1st 1975). Please refer to the discussion earlier in this chapter for more information about the
Cleanup function.
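Putting the pieces described above together, the import might be sketched as follows. This is an illustration under assumptions rather than a definitive listing: the class and method names are invented, the file path is supplied by the caller, and the COM cleanup step is omitted.

```vbnet
Imports Excel = Microsoft.Office.Interop.Excel

Public Class OpenTextSketch
    ' fileName: full path of the semicolon-delimited text file (supplied by the caller)
    Public Shared Sub ImportParts(ByVal fileName As String)
        Dim app As New Excel.Application()
        app.Visible = True

        ' Delimited (not fixed-width) fields; column #1 as text, column #2 as general
        Dim myType As Excel.XlTextParsingType = Excel.XlTextParsingType.xlDelimited
        Dim myFormat As Object(,) = {
            {1, Excel.XlColumnDataType.xlTextFormat},
            {2, Excel.XlColumnDataType.xlGeneralFormat}}

        ' Import the data file, which adds a new item to the Workbooks collection
        Dim workBooks As Excel.Workbooks = app.Workbooks
        workBooks.OpenText(fileName, DataType:=myType, Semicolon:=True, FieldInfo:=myFormat)

        ' OpenText returns nothing, so pick up the newly opened workbook afterwards
        Dim workBook As Excel.Workbook = app.ActiveWorkbook
        Dim workSheet As Excel.Worksheet = CType(workBook.ActiveSheet, Excel.Worksheet)
        Dim cells As Excel.Range = CType(workSheet.Columns(1), Excel.Range)
        cells.ColumnWidth = 22      ' Wide enough to show an 18-digit part number
    End Sub
End Class
```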
Once we have imported the data, we can use other Excel API functions to format it further, as we saw earlier in this
chapter. To do this, replace the code after the workBooks.OpenText line with the following:
cells = workSheet.Columns(1)
cells.Font.Bold = True ' Make column #1 bold
cells.ColumnWidth = 22 ' Adjust its width
cells = workSheet.Columns(2)
cells.NumberFormat = "0.00" ' Format column #2 with 2 decimal places
cells = workSheet.Columns(3)
cells.NumberFormat = "$#,##0" ' Format column #3 as currency
cells = workSheet.Columns(4)
cells.NumberFormat = "dd-mmm-yyyy" ' Change the date format in column #4
Note the little green triangles in the “C” column. Excel is telling us that the items in this column look like numbers,
but we have formatted them as text, which might be a mistake. It’s not a mistake, in this case, of course, but Excel
gives us the helpful hint, anyway.
Again, Excel has stored the part numbers as numbers, rather than text, so we have lost the last three digits, and no
subsequent reformatting operation will be able to recover them.
Since Office 2007, Excel documents (.xlsx files) have been stored as a set of XML files packaged in a zip
container. The overall scheme is called Office Open XML, and the constituent pieces use formats called
SpreadsheetML, DrawingML, and so on. Microsoft provides a software library called the Open XML SDK containing
functions that make it easier to work with the XML data. For many scenarios, this is now the recommended way of
reading and writing MS Office documents.
One advantage of the XML-based approach is that it allows a spreadsheet document to be created and formatted
without using the Excel API functions, which means that it will work on a machine that has no access to Excel itself.
The same is true of the CSV-based approach, to some extent, although you will need to use Excel API functions if
you want to do any formatting or other operations.
An example of this approach is described at:
http://blogs.msdn.com/b/brian_jones/archive/2008/11/04/document-assembly-solution-for-spreadsheetml.aspx