Apache Beam – Bounding Data to BigQuery

We have what seems like a straightforward task.

Take data from Google Pub/Sub and push it into BigQuery using Apache Beam.

However, we hit a few issues, and here’s my high-level breakdown of what we discovered.

Google Cloud Dataflow becomes open source

To connect Pub/Sub to BigQuery you would originally have used Google Cloud Dataflow, however in early 2016 the Dataflow SDK was open sourced as Apache Beam. What does this mean? Old documentation on Google’s sites and new documentation on Apache’s sites.

When searching for answers you seem to hit a maze of old Google pages, various Apache Beam versions (with what appear to be breaking changes in some cases), and very few useful blog posts.

At the time of writing the latest documentation is here, however always confirm you’re looking at the latest version:-

https://beam.apache.org/documentation/sdks/javadoc/2.4.0/overview-summary.html

My kingdom for a usage!

The Apache documentation has a fair bit of information in it, however it has very few usage examples, so figuring out how to actually use some of the classes is a nightmare. Odd snippets of code found elsewhere gave me answers more often than I’d like to admit.

Hopefully there will be more usage examples in the Beam documentation going forward.

Where are all the “Experts”?

When you go to Stack Overflow there are a lot of questions with no comments and no answers. It’s concerning, as it suggests that Apache Beam hasn’t been widely adopted by the general community yet.

Bounding

So the crux of our issue was this: we needed bounded data.

Bounded data in Beam terms is a dataset of known, finite size, whereas unbounded data is a stream with no defined end. We discovered pretty quickly that when reading from a file our data would be pushed to BigQuery as a bounded load, which is great, while reading from Pub/Sub would stream the data.

The reason (obvious in hindsight!) is that the data is coming from a subscription, so it arrives whenever it is published and is not bounded.

The Apache documentation talks a lot about various ways to make unbounded data bounded, such as setting a maximum number of records, setting a maximum read time, or just setting the IsBounded flag on the PCollection.

None of these worked for us.

Why Bounded?

Streaming data into BigQuery comes at a cost. Loading bounded data is free, provided you stay within the limit on the number of load jobs you can run per day.

The Solution!

To get unbounded subscription data loaded into BigQuery in bounded batches we ended up using the following code:-


p.begin()
    .apply("Input", PubsubIO.readAvros(MyDataObject.class).fromTopic("topicname"))
    .apply("Transform", ParDo.of(new TransformData()))
    .apply("Write", BigQueryIO.writeTableRows()
        // Use periodic load jobs rather than streaming inserts
        .withMethod(BigQueryIO.Write.Method.FILE_LOADS)
        // A load job is triggered at this interval. BigQuery limits load jobs per day, so avoid setting this too low
        .withTriggeringFrequency(org.joda.time.Duration.standardSeconds(2))
        // Required whenever withTriggeringFrequency is set
        .withNumFileShards(1)
        .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
        .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
        .withSchema(schema)
        .to(table));
p.run().waitUntilFinish();

NOTE:- withNumFileShards is MANDATORY when using withTriggeringFrequency, however this isn’t documented anywhere. If you don’t include it you get an IllegalArgumentException with a message of “null”.
There is a Jira issue for this: https://issues.apache.org/jira/browse/BEAM-3198

SSRS Date Format / Localization

I find myself always falling into this trap.

We in Australia, like most of the civilised world, like our dates in DD/MM/YYYY format.

By default SSRS uses MM/DD/YYYY, which really only bothers me because it doesn’t recognise my data as valid dates (13/12/2018 for example)!
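To see why 13/12/2018 gets rejected, here is a minimal sketch in plain JavaScript (not SSRS code) that checks the same string under both field orders. The isValidDate function and the "DMY"/"MDY" labels are hypothetical names made up for this illustration.

```javascript
// Hypothetical helper: checks whether "a/b/yyyy" is a real calendar date
// when read in the given field order ("DMY" or "MDY").
function isValidDate(text, order) {
  var parts = text.split("/").map(Number);
  if (parts.length !== 3 || parts.some(isNaN)) {
    return false;
  }
  var day = order === "DMY" ? parts[0] : parts[1];
  var month = order === "DMY" ? parts[1] : parts[0];
  var year = parts[2];
  // Date.UTC rolls invalid values over (month 13 becomes January of the
  // next year), so a date is only valid if it survives the round trip.
  var d = new Date(Date.UTC(year, month - 1, day));
  return d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day;
}

console.log(isValidDate("13/12/2018", "DMY"));
console.log(isValidDate("13/12/2018", "MDY"));
```

Read as day/month/year the string is the 13th of December. Read as month/day/year it asks for month 13, which does not exist.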

To fix this, open your report and click on any item on the report (image, text box, table etc). Then in your Properties find “Report”.

In Report you’ll find “Language”, which defaults to blank. Set it to the locale you need (en-AU for example) so that dates are treated as DD/MM/YYYY.


C# – Memory stream is not expandable

This COMPLETELY blew my mind!

I had the following code:-


using (MemoryStream contentStream = new MemoryStream(mydata))
{
    using (WordprocessingDocument myDoc = WordprocessingDocument.Open(contentStream, true))
    {
        MainDocumentPart mainPart = myDoc.MainDocumentPart;

        // Do some stuff

        mainPart.Document.Save();
        myDoc.Close();
    }
}

On mainPart.Document.Save I received the error “memory stream is not expandable”.

I was really confused until I found this post

https://www.codeproject.com/Questions/325125/memory-stream-can-not-be-expendable

If you pass your buffer to the constructor then the stream is not expandable. See below:-

// This CANNOT be expanded: the stream is fixed to the size of the buffer
MemoryStream fixedStream = new MemoryStream(data);

// This CAN be expanded
MemoryStream expandableStream = new MemoryStream();
expandableStream.Write(data, 0, data.Length);

So in my case I needed to do the following:-


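using (MemoryStream contentStream = new MemoryStream())
{
    // Copy the data in rather than wrapping the buffer, so the stream can grow
    contentStream.Write(mydata, 0, mydata.Length);

    // Open and save the WordprocessingDocument as before
}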

Then everything was fine. So beware!

What was gone is back again. VS Code Icons reverted


In what is a welcome relief to most developers, the VS Code team have decided to revert to the blue and green icons of the past, away from the orange icons that caused such annoyance.

Many of us run VS Code and VS Code Insiders side by side and the orange icons weren’t exactly conducive to telling them apart. After the feedback (and with a very apologetic blog post!) we’re back to the old colors!

Read the story here:- https://code.visualstudio.com/blogs/2017/10/24/theicon

Why multiple offices can be a great thing

I work for an organisation with staff that span the globe. My boss has a strong appreciation for great staff, and when told that some are going overseas he often asks them to continue working for the company. Most of our work is done out of three core offices, and staff are at times required to do work that overlaps from one office to another.

In 2017 the breadth of technology available for staff integration is staggering and it creates an environment that makes the world seem a lot smaller. Half of the staff I work with I have never seen in person however I would recognise their face if they walked past or know their voice in a crowd.

Distance is no longer the barrier it once was. Instead of being a hindrance to productivity, I have found that having different offices has enhanced it. Like meeting up with a friend who you haven’t seen in a long time, having a staff member come from a different office may in the short term affect productivity (lunches might go a bit longer than normal!), however it also comes with a long list of benefits.

24/7 Development Process
We can now have essentially a full-time development cycle. We have designers and developers in different time zones, which can be frustrating at times, however it also leads to fixes and designs being done at all hours. Need a logo done? Get it designed overnight. Got an urgent bug fix at 6pm? Speak to the developers in the middle of their workday.

In depth Discussions about work
The standard water cooler talk is replaced with questions about why the person is in town. I have found that working across different offices actually increases knowledge sharing, as you are regularly discussing the latest technologies. Perhaps working in different offices reduces some of the personal friendships between staff, however that is replaced with engaging conversation that is more work-related.

A simple kick in enthusiasm
The buzz in the office changes immediately when someone visits. It’s a little shot of adrenaline for the team because it is a break from the ordinary. Work takes up so much of our time that finding regular ways to increase staff motivation can be hard, but it is necessary. There is a noticeable pep in the staff for the days following a visit.

I think we can all agree that there are times where a separated organisation can be a large negative however it is no longer the blocker to productivity that it once was. Starting offices all over the globe for the reasons mentioned above might not be the best idea however ensuring that your offices work together can introduce a fresh dynamic to the standard 9 to 5.

Dynamics 365 – Xrm.Page.ui.setFormNotification

To add notifications on Forms you can use the very handy Xrm.Page.ui.setFormNotification.

This will allow you to show the following alerts.


To add them use the following:-

Xrm.Page.ui.setFormNotification("Important Test", "ERROR");
Xrm.Page.ui.setFormNotification("Warning Test", "WARNING");
Xrm.Page.ui.setFormNotification("Info Test", "INFO");

NOTE:- A lot of blog posts out there have the last one as:-

Xrm.Page.ui.setFormNotification("Info Test", "INFORMATION");

This is NOT correct and will mean the information icon won’t appear.

See the official documentation here:-

https://msdn.microsoft.com/en-us/library/gg327828.aspx#BKMK_setFormNotification
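To avoid the “INFORMATION” trap, one option is a small wrapper that rejects unknown levels before calling the API. This is only a sketch: notify and VALID_LEVELS are hypothetical names, and the form UI object is passed in as a parameter (on a real form you would pass Xrm.Page.ui) so that the snippet can also be run outside CRM. It passes the optional third uniqueId argument so the notification can later be removed with clearFormNotification.

```javascript
// The three levels setFormNotification accepts. "INFORMATION" is not one of them.
var VALID_LEVELS = ["ERROR", "WARNING", "INFO"];

// Hypothetical helper: formUi is expected to be Xrm.Page.ui on a real form.
// Throws on an unknown level instead of silently showing the wrong icon.
function notify(formUi, message, level, uniqueId) {
  if (VALID_LEVELS.indexOf(level) === -1) {
    throw new Error("Unknown notification level: " + level);
  }
  return formUi.setFormNotification(message, level, uniqueId);
}

// On a form:
//   notify(Xrm.Page.ui, "Info Test", "INFO", "info-test");
// and later, to remove it:
//   Xrm.Page.ui.clearFormNotification("info-test");
```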

Dynamics CRM – Window OnLoad Unable to get property ‘value’ of undefined or null reference

We recently had a problem where we were receiving the following error in a single environment only.

[Screenshot: Window OnLoad error, “Unable to get property ‘value’ of undefined or null reference”]

As it only occurred in one environment I presumed it was a deployment issue. However, I was incorrect.

We were receiving the above error because we restored a full database from one environment to another. We had some JavaScript which was checking the owner of a record. The owner was displayed on the record however it was from an external domain.

Because of this it threw an error right down in the CRM depths at:-

function crmForm_window_onload_handler(eventObj, eventArgs) {
  try {
    var eContext = Mscrm.FormUtility.constructExecutionObject(eventObj, 0, null, null);
    eContext = Mscrm.FormUtility.constructExecutionObject(eventObj, 0, null, eContext);
    loadInsideView();
    eContext = Mscrm.FormUtility.constructExecutionObject(eventObj, 1, null, eContext);
    CEI.Initialize();
  } catch (e) {
    displayError('window', 'onload', e.description);
  }
}

So if you see an error like this occurring in one environment but not in others, make sure you confirm that the record you are looking at is correctly mapped to the new domain.
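Our own script isn’t shown here, so the following is only a sketch of how a form script could read a lookup such as the owner defensively. It does not fix the domain mapping itself, it just stops a custom script from assuming a value is always there. getLookupId is a hypothetical helper, and the page object is passed in (on a real form you would pass Xrm.Page) so the snippet can be run outside CRM.

```javascript
// Hypothetical helper: page is expected to be Xrm.Page on a real form.
// Returns the id of the first record in a lookup attribute, or null when
// the attribute is not on the form or has no value.
function getLookupId(page, attributeName) {
  var attribute = page.getAttribute(attributeName);
  if (!attribute) {
    return null;
  }
  // For a lookup, getValue() returns an array of records, or null when empty.
  var value = attribute.getValue();
  if (!value || value.length === 0) {
    return null;
  }
  return value[0].id;
}

// On a form:
//   var ownerId = getLookupId(Xrm.Page, "ownerid");
//   if (ownerId === null) { /* handle the missing owner */ }
```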

Dynamics CRM – 0x8004F016 – A managed solution cannot overwrite the LocalizedLabel component with Id=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX which has an unmanaged base instance. The most likely scenario for this error is that an unmanaged solution has installed a new unmanaged LocalizedLabel component on the target system, and now a managed solution from the same publisher is trying to install that same LocalizedLabel component as managed. This will cause an invalid layering of solutions on the target system and is not allowed.

We had a client who had a very old version of our CRM customizations managed solution. When we attempted to update we received the following error.

ErrorCode:
0x8004F016

ErrorText:
A managed solution cannot overwrite the LocalizedLabel component with Id=XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX which has an unmanaged base instance. The most likely scenario for this error is that an unmanaged solution has installed a new unmanaged LocalizedLabel component on the target system, and now a managed solution from the same publisher is trying to install that same LocalizedLabel component as managed. This will cause an invalid layering of solutions on the target system and is not allowed.

Checking the internet, the responses seemed to be based around deleting data from the SQL database. This is not recommended for two reasons. 1) It’s not supported by Microsoft 2) It doesn’t work for CRM Online.

The item that was failing was:

Form Incident Case

Investigating the forms related to Case, I found that the names of the forms had been changed, I suspect by an unmanaged customization. By changing the name back to the correct name I could then import my managed solution correctly.

Dynamics CRM – Update Opportunity on ‘Close as Won’ or ‘Close as Lost’

I had a requirement that I thought would be quite basic: on Close as Won/Lost, update a field on the Opportunity.

My first thought was that this could be done using the following plugin setup:

  • Message – Update
  • Primary Entity – opportunity
  • Secondary Entity – none

However, I found that the above did not fire. A bit of googling and I found that I instead needed to use the following:

  • Message – Win
  • Primary Entity – opportunity
  • Secondary Entity – none

AND

  • Message – Lose
  • Primary Entity – opportunity
  • Secondary Entity – none

This does not return the opportunity as “Target” as you might expect. It actually returns “Status” and “OpportunityClose”. OpportunityClose contains the information entered on the form displayed when you try to Win/Lose an Opportunity; CRM stores this as a separate entity.

OpportunityClose does provide you with the related opportunityid however, so you can update the original opportunity if required (just be aware that THIS will then fire any “Update opportunity” plugin steps!)

var context = (IPluginExecutionContext)serviceProvider.GetService(typeof(IPluginExecutionContext));
// The tracing service is needed by the catch block below.
var tracingService = (ITracingService)serviceProvider.GetService(typeof(ITracingService));

if (context.InputParameters.Contains("OpportunityClose")
    && context.InputParameters["OpportunityClose"] is Entity)
{
    // Obtain the OpportunityClose entity from the input parameters.
    var opportunityClose = (Entity)context.InputParameters["OpportunityClose"];

    try
    {
        if (opportunityClose.Attributes.Contains("opportunityid"))
        {
            var opportunityId = ((EntityReference)opportunityClose.Attributes["opportunityid"]).Id;

            var serviceFactory =
                (IOrganizationServiceFactory)serviceProvider.GetService(typeof(IOrganizationServiceFactory));

            // Get the organisation service from the factory.
            IOrganizationService service = serviceFactory.CreateOrganizationService(context.UserId);

            Entity opportunity = service.Retrieve("opportunity", opportunityId, new ColumnSet(new string[] { "closeprobability" }));
            opportunity["closeprobability"] = 50;
            service.Update(opportunity);
        }
    }
    catch (Exception ex)
    {
        tracingService.Trace("Plugin: {0}", ex.ToString());
        throw;
    }
}