Tag Archives: “Export To Excel”

My Journey into PowerShell

One more feather added into my crown, recently I was started using PowerShell from Microsoft. The requirement is simple, our company has a huge pile of documents for a project and they want to put all of them into our newly commissioned document management system named Windchill. So, our consultants gave us a JAVA console application to load the documents into Windchill. That tool required a CSV file with predefined columns. So it all fall onto my head to prepare a CSV file. I was given the location where those documents are sleeping and I went there to only make my self sick. There are just 18000+ documents….. Since I have no other options, I was thinking about how I can prepare the CSV file out of these. That CSV file should have following,

  1. Document Name
  2. Location of document
  3. Document Number
  4. Document Version
  5. Process Code
  6. Sub-Process code
  7. Client document number

So how do I get these? going through each document and note down the details is not an option that I want to exercise. I am a programmer at heart, fellas… I want to do things simple enough so that I can manage.  With some keen observation I found that our engineering guys made my life much easier. They put all of them in the document name itself. So all I need is to parse the document name and put all that information into a CSV file. That’s easy I thought myself. All I need to write a program to do that. Now that’s the problem, my be I was too board of C# or may be I saw a geek video last night, I want to do it in much simpler way that too using some script. There, at that time I saw a blue icon with a command prompt symbol laying on my task bar. Its a long forgotten tool and I open it only when I saw any computer geek movies to only close it once more time. It is PowerShell from Microsoft. Ok, now it seems I got a solution for my problem problem. A quick web search revealed that this little guy can do many things that I want. so it’s time to learn new things. For me learning new things become a hobby and Its pretty exciting to learn new things. I started organizing my requirements, what I want from PowerShell, the algorithm…

  1. Go through all the folder recursively and read all document names
  2. List Name, revision, document number and physical path to the document.
  3. Push them into a CSV file and I am done….

Now my milestones to achieve this,

  1. List document names by recursively reading through the folders
  2. Check how to parse the document names and list them
  3. Finally push them into a CSV file

Listing document names recursively,

Fall back to my favourite search engine, BING and learned that “Get-ChildItem” will do the job for me. Here is my output Get-ChildItems

Ok, now I achieved my first goal and moving onto my next one. This time I want to list the documents recursively. Its easy part, “Get-ChildItem -recurse” will do the job. That’s cool, I crossed my second milestone too. Now my third step, how to sub-sting a string? We can achieve this by using “substring()” function. This function takes two parameters, one is from what index to start and another one is length of string to extract. See the following output, SubStirng

Ok, going good. Now I need to pump the document name to substring and list it as individual columns. But wait, my output is not correct, I want only two things, document name and document physical path. Once I have these two, I need to extract remaining columns from document name. One more things, I don’t want to bring the folder names but only files within them. So, its time to play with “Get-ChildItem“.

My immediate thing to do is exclude the directories, I can get this by “Get-ChildItem -Attributes !Directory“. Lets see the output;

Original:

Get-ChildItem-All

Here all highlighted ones are folders in the current folder. Now lets apply those switches;

Get-ChildItem-NoDIr

They are gone… Great now its time to change the output. I want only name of the document and full path to it. We can achieve this by piping the output into another commandlet, named, “Select-Object“. Here is the output;

Get-ChildItem-NoDIr-NameFull

At this point, I am able to get name of the document and full path to it. Now I need to extract the desired columns, remember I need to get document number, version.

Now a little confession to make, the output till now you are seeing in the above screenshots are not from production server. Therefore the names are not accurate. Actual file names are like this “TP123-8080-DWG-0001-Rev0 Some Name.Ext“. In this name, I am interested in first part of the file name. That contains the document number and revision and hopefully they are accurate. So all I need is to extract the file content till the first space. Let’s see how we can achieve this.

To make a substring from the input from Select-Object we write an expression as “{$_.Name.SubString(0, $_.Name.IndexOf(‘ ‘))}“. Look into the output;

Get-ChildItem-Expression_1

If you observe closely, the column name is nothing but our expression. But that’s not what we want, let’s give it a name. For that purpose, we need to change the expression slightly to “@{Name=”DOC_NUMBER”; Expression=$_.Name.SubString(0, $_.Name.IndexOf(‘ ‘))}}“. Following is the output;

Get-ChildItem-Expression

Here I need to refine the DOC_NUMBER expression to remove revision part. This is because revision number is not part of the document number. My refined expression will be “@{Name=”DOC_NUMBER”; Expression={$_.Name.SubString(0, $_.Name.SubString(0, $_.Name.IndexOf(‘ ‘)).LastIndexOf(‘-‘))}}

Great!!! Now lets add revision number to our list. To achieve this, we reuse the document number expression and take everything after last index of “-“. Since, the document number is has fixed length, i.e. 19 characters we can substring from 19th character to rest. Here is my expression “@{Name=”DOC_REVISION”; Expression={$_.Name.SubString(0, $_.Name.IndexOf(‘ ‘)).SubString(19)}}” and following is the output;

Get-ChildItem-Expression-REv

Now I am almost done, all I need is to export this to CSV. To do that we pump this output to another commandlet named, Export-Csv and passing the name with path. Here is my final PowerShell command and its output;

Get-ChildItem -Attributes !Directory -Recurs | Select-Object Name, FullName, @{Name=”DOC_NUMBER”
; Expression={$_.Name.SubString(0, $_.Name.SubString(0, $_.Name.IndexOf(‘ ‘)).LastIndexOf(‘-‘))}}, @{Name=”DOC_REVISION”
; Expression={$_.Name.SubString(0, $_.Name.IndexOf(‘ ‘)).SubString(19)}} | Export-Csv c:\logs\exp.csv

Excel-csv

That’s it fellas, I hope you learned something today. What I realized is, PowerShell is really powerful. It is integrated into system so much so that we can do virtually anything in the system.

Advertisements