PGSQL Phriday #005: Relational vs. Non-Relational Data

I always kinda see a fight in my mind when someone mentions Relational/Non-Relational. I took this picture long ago and some good friends are posing for me. 

This post is in relation to the #PGSQLPhriday blogging series. The original post for this discussion is located here. Thank you Ryan Lambert for such a great topic and for hosting this month! 

Ryan posted a challenge asking how you are using Postgres. I really liked all the questions, so instead of focusing on just one, I’m simply going to answer all of them!

What non-relational data do you store in Postgres and how do you use it?

At my current company, the PG servers I work with have a great number of JSONB columns that house non-relational data. I am newer to this company, so I really didn’t have much of a say in the design of this particular aspect. Currently, those JSONB columns are used primarily as a logging/settings store. It was a pattern that was created with good intent but then misused, turning it into the Data Junk Drawer. Unfortunately, the data is needed in many cases, so you have procedures/functions out there trying to query the data out of these fields and use it in various places. This is not an ideal use for it.
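One possible way out of a junk drawer like this, sketched under assumptions: promote a key that code keeps digging for into a real column. The sketch uses Python's built-in sqlite3 with the JSON held as text, purely as a stand-in so it runs without a Postgres server. The `account_settings` table and the `timezone` key are hypothetical, not taken from the system described above.

```python
import json
import sqlite3

# In-memory SQLite stands in for Postgres; the settings column holds JSON as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account_settings (id INTEGER PRIMARY KEY, settings TEXT)")
conn.executemany(
    "INSERT INTO account_settings (id, settings) VALUES (?, ?)",
    [
        (1, json.dumps({"timezone": "UTC", "theme": "dark"})),
        (2, json.dumps({"timezone": "America/Chicago"})),
        (3, json.dumps({"theme": "light"})),  # this row has no timezone key at all
    ],
)

# Promote the frequently queried key into a real, indexable column.
conn.execute("ALTER TABLE account_settings ADD COLUMN timezone TEXT")

# One-time backfill. In Postgres the same backfill could be a single UPDATE
# using the ->> operator on the JSONB column instead of a client-side loop.
rows = conn.execute("SELECT id, settings FROM account_settings").fetchall()
for row_id, raw in rows:
    tz = json.loads(raw).get("timezone")  # None when the key is missing
    conn.execute(
        "UPDATE account_settings SET timezone = ? WHERE id = ?", (tz, row_id)
    )

promoted = conn.execute(
    "SELECT id, timezone FROM account_settings ORDER BY id"
).fetchall()
```

After the backfill, procedures can filter on the plain `timezone` column, and rows that never had the key simply hold NULL.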

Here is one good example of non-relational data used well. I did some contract work for a document scanning company that used OCR to read most of the data/metadata about the documents and put it into table structures, then stored a link to the actual document in a file store. This link was added to the row so the application could retrieve the document, but the document itself wasn’t actually stored in the DB. The performance issues they called me for were still related to the data they were trying to scan, but it was mostly bad indexing. Once we corrected the indexes everything worked much better, and they had a good system that would scale reasonably well.
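A minimal sketch of that pattern, with the metadata in a table and only a link to the file in the row. It uses Python's built-in sqlite3 as a stand-in so it runs without a Postgres server. The table, columns, index, and URL are all hypothetical and not the client's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Metadata extracted by OCR lives in ordinary columns; the document itself
# stays in the file store and the row only keeps a link to it.
conn.execute(
    """
    CREATE TABLE scanned_document (
        id            INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        scanned_on    TEXT NOT NULL,
        file_url      TEXT NOT NULL
    )
    """
)

# Index the column the application searches by (hypothetical choice).
conn.execute(
    "CREATE INDEX idx_scanned_document_customer ON scanned_document (customer_name)"
)

conn.execute(
    "INSERT INTO scanned_document (customer_name, scanned_on, file_url) VALUES (?, ?, ?)",
    ("Acme Corp", "2023-02-01", "https://files.example.com/scans/0001.pdf"),
)


def find_document_urls(connection, customer_name):
    """Return the file-store links for one customer's documents."""
    rows = connection.execute(
        "SELECT file_url FROM scanned_document WHERE customer_name = ? ORDER BY id",
        (customer_name,),
    ).fetchall()
    return [url for (url,) in rows]


urls = find_document_urls(conn, "Acme Corp")
```

The application would then fetch the file from the returned URL, so the database only ever serves small metadata rows.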

Have you attempted non-relational uses of Postgres that did not work well? What was the problem?

Yes. Besides the “Junk Drawer” system I mentioned above, I also ran into another company that stored logos/graphics/images for the website in the DB. Retrieval from the DB wasn’t really horrible, since they were pulling exactly one row/column; the real issue came with caching. Websites can’t take 2 minutes to pull up images and documents, so they depend on the cache to run fast. The web servers of the time couldn’t cache the images since they came from DB calls and were not static files. This meant that each page load had to re-pull the data. Once the system started scaling and getting lots of requests, the web servers couldn’t keep up with re-loading the images over and over. In the end, they had to move the images out of the DB and into the web server cache where they belong. I strongly urge all of my customers to never store images/graphics in the DB at all if they can avoid it.

What are the biggest challenges with your data, whatever its structure?

Right now the biggest challenge for the current project/company I work for is DB sprawl. We have 10+ years of a DB designed around a monolithic application that added every new table to the same DB. Only a few of the application’s functions were ever broken apart into separate pieces or separate schemas, or really separated at all. This is not uncommon in the industry, since many companies that have been building software for 10-15 years started out with monolithic applications, and rarely do we go back and focus on the technical debt side of things. Moving systems like this to a Service Oriented Architecture, or even just breaking them up into different pieces, takes time and a lot of cultural change for a company. It’s always a long-term project, and in a larger company it will usually take years to complete. It’s a fun challenge to have, and you can find small wins by breaking off smaller pieces and starting new products/new features in a new system.

How do you define non-relational data?

I like to define data based on how it’s used. I see a lot of different data across companies, so I keep an open mind and base my definitions on how the data is used. An example of this is search. We all know PG and many other RDBMSs can use indexes to search very quickly through millions of rows of data. In past experience, though, I’ve found that moving heavy search functions off to something like Elasticsearch has been a great way to speed up search even more. RDBMSs are really not as good at textual-style searches as something like an Elasticsearch engine. It’s an excellent example of using the right tool for how the data is being used.

I really enjoyed this #PGSQLPhriday, and I hope more people in the community will join in the fun for a future #PGSQLPhriday. Thank you Ryan Lambert for the great topic!

Speaker Idol

My first attempt at a little AI art, made with OpenAI DALL·E 2.

This was a write-up I did for my company to explain what Speaker Idol is and how it could be used to get more speakers for our internal training. Since I was already writing it up for internal use, I decided to turn it into a blog post as well.

Speaker Idol is a concept/idea that allows a group to gather more speakers for future events.  If you are at a meetup and you are trying to fill your speaking roster for the year, this gives you a method to do that.  It also helps to grow your speakers and help new speakers get started.  The goals of Speaker Idol are simple. 

  1. Give a new speaker the chance to speak in a safe, supportive environment, and with minimum preparation time.  This removes the barrier to entry for new speakers. 
  2. Grow the speaker pool you have available to you as an organizer of events, whether that’s a user group, meetup, or conference.
  3. Teach new speakers with constructive feedback about how to improve the session. 

This is not a new idea, and I do not take credit for coming up with it. I have used it successfully in a few user groups I’ve run and continue to use it.  It is another tool that community individuals can use to grow the community around them.

What is a Speaker Idol presentation? 

Put simply, it’s a full presentation where you only present one portion of it.

New speakers tend to have a big idea! Perhaps something like “RDBMS Internals”.  This is a little large for a first-time speaker to be taking on. It could literally take hundreds of presentations to cover that concept. But they might be able to think of one specific idea, like “Pros and Cons of clustered indexes”. This is something that could be one part of a presentation called “Indexes in an RDBMS”. In that larger presentation you might talk about many different types of indexes and plan 1-2 slides on the “Pros and Cons of clustered indexes”.  Perhaps you have a demo to show the advantages and what’s good and bad about them.

This is how you would break down a Speaker Idol presentation.  For a brand new speaker, we can coach them that they just need 1 idea, 1 demo, and a few slides to talk through it. Once they have gotten over this hurdle, they have the building blocks and the time to build the rest of the presentation, and they can then present the full version.

Grow the Speaker Pool

Once someone presents a short presentation at Speaker Idol, the goal is to have them come back to the group and do the full presentation. Consider the short presentation a preview of the real thing. This helps to grow speakers, and it also gives organizers more presentations to fill out the year. It is your responsibility to keep fostering that new speaker and encouraging them to come back and do the full presentation. Most of the time, when a speaker gets over the initial hump of presenting they want to keep doing it.  This is not true in all cases though, so please don’t pressure people into feeling they are required to do the full presentation.  Presenting might not be for them, and most importantly you still want to keep them as a member of the community.

Constructive Feedback 

The best way to improve as a speaker is good feedback from your audience, and Speaker Idol is designed to help with this. Every Speaker Idol session starts with a quick 5 minutes on how to give “constructive speaker feedback”. The key items are listed below; you can change these as you need.

  1. Please be detailed in your feedback and tell me how we can improve. Good examples:
    1. “Speaker needs to talk louder”
    2. “Speaker needs to engage the audience”
    3. “Speaker should be better prepared and have tested the demo to make sure it worked properly”
  2. Bad examples:
    1. “Speaker sucks” (Tell what they did wrong)
    2. “Boring session” (Tell why it was boring)
    3. “Worst session of the day” (Tell why and what went wrong)
  3. Tell the speaker what you did/did not like about the session. Good examples:
    1. Loved the jokes/comedy, I really wished you would have reviewed this query in more detail.  
    2. You paid great attention to the audience but kept forgetting to repeat the question. Please do so in the future.
    3. I was really hoping to see X information in this presentation since that’s what I got from the Abstract.  

We use a simple Google Form that allows audience members to leave feedback.  I know that in most presentations you don’t get a lot of this. That’s why Idol has to focus on it and remind people to go fill out the form while the next presenter is getting ready. If you are in person, hand out paper forms and announce that a prize drawing will be held from the forms handed in. Even if it’s online, create a drawing for the audience based on the feedback.  This greatly improves the chance of getting feedback. It’s critical that whoever is running Speaker Idol reminds everyone to give feedback.

The Rules (More like “guidelines” adapt as you see fit)

  1. Each presentation should be longer than 5 minutes but less than 15 minutes. Q&A is included in the time.  TIP – focus on one topic and one demo; you are trying to get across one key point of a bigger idea.
  2. You still will need an Abstract and Bio for the presentation.  TIP – This is a common practice in presentations and you should get used to writing them. 

That’s it. Pretty simple, right?  Here are some more tips.

  1. If you have 1-hour meetings, schedule 3 presentations.  This gives your audience a good mix of presentations, and if you have a big list of participants it helps you get through them. It should give you enough buffer time for feedback and talking as well.
  2. Plan a speaker gift!  Originally we had this as a competition and crowned a 1st, 2nd, and 3rd place winner. You are welcome to do that. I found in later years I really just wanted to grow speakers and grow the community so I moved this to a speaker gift for those that took part.  This is entirely up to you.
  3. Plan an audience feedback drawing.  As mentioned above feedback goes better if you have a drawing for it.  Plan to get a couple of gift cards or something that you can draw from the feedback.  
  4. Give the feedback quickly to the speakers.  If you use an online form and have to do any parsing/work to get the data out of the form, make sure you do it quickly after the presentation. The more immediate you can make the feedback to the presenter, the better.
  5. DO NOT open up the session for public feedback and discussion.  If you are a small internal team in a company this is OK, if you are a meetup/user group/public meeting with many different people make sure this feedback is done through a form method.  
  6. Do an “Introduction to presenting/Speaking 101”  presentation for the new speakers before the event.  Typically about 2-3 weeks before so they can adjust the presentation if needed.  I have videos I’ve done on this in the past you can use or you can create your own. <link to past video I did for this. Apologies for the not great audio>

I’ve found in years past this is a great way to kick off your year as a user group leader.  It gets people excited and gets a lot of future presenters for you to work with.  Hopefully, you can use this concept to grow your community! 

PGSQL Phriday #004: PostgreSQL and Software Development

Perhaps a good discussion and drink are all that you really need. 

This blog post is for the PGSQL Phriday blogging event. Henrietta Dombrovskaya asks about software development and PostgreSQL. This is a significant topic. I work with developers frequently on how they use the database (usually telling them it’s time to move away from an ORM), so for this month I’ll take the topic from that perspective.

Did you ever have (frustrating) interactions with application developers? Did they end with some truce?

I rarely have non-frustrating interactions with developers when I start working on a DB project. The first thing I do is have a good discussion with the DB. In reality these are scripts I’m running, but it usually goes something like this.

Pat: Hey DB, let’s check your query and resource utilization and see how you are doing.

DB: I keep getting these different queries hitting me, none are in functions or procedures, none are optimized and they keep asking for 10 tables in every join?!?!!?! Help! 

Pat: Sounds like you have an ORM problem.

DB: What’s an ORM? Why is it hurting me so much? Please help! 

Pat: An ORM is a way for developers to build something fast, cheap, and easy. They use it to build applications quickly and then rarely if ever find ways to remove those ORMs and make a better application.

DB: Can we charge them with DB abuse? 

Pat: Unfortunately it’s not that easy, but don’t worry I’ll start working with them and we’ll give you some medicine for the short term and long term we will work on removing the bad pieces. 

DB: <Takes some new index pills, and starts feeling better> Thanks! I look forward to being healthy in the future! 

I just realized this would be a perfect “Tell me you’re a DBA without telling me you’re a DBA”.

As data people, whether with PostgreSQL or any other RDBMS, we feel a certain need to “defend” the database. It’s a system that cannot defend itself in most cases. It isn’t wrong to defend the database, but you have to be careful how you present this.

We too often assume developers are doing this because they just don’t care about the DB. I find that’s almost never the case. They want to write an optimized, scalable, and efficient system as well; they just don’t usually understand how to do that. Most application developers that went through school barely got 1-2 semesters of “How to write SQL”, and most of that time was spent on how to work with 3rd normal form and nothing else. They also don’t think like databases. In the procedural logic application developers live in, everything is “IF this happens then do X else do Y”. It is built around one item at a time going through some sort of logic. Databases think in sets: they work on large chunks of data, organized as rows and columns rather than as one item. So at first most application developers apply the same approach they used in code to the DB, and you end up in a bad situation.
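To make the procedural-versus-set contrast concrete, here is a small sketch under assumptions. It uses Python's built-in sqlite3 as a stand-in so it runs without a Postgres server, and the `orders` table and the "flag open orders over 100 for review" rule are hypothetical. Both functions produce the same result, but the first walks one row at a time through IF logic while the second describes the whole set in a single statement.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory orders table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO orders (status, total) VALUES (?, ?)",
        [("open", 50), ("open", 150), ("closed", 200), ("open", 300)],
    )
    return conn


def flag_row_by_row(conn):
    """Procedural style: fetch every row, apply IF logic, update one at a time."""
    rows = conn.execute("SELECT id, status, total FROM orders").fetchall()
    for order_id, status, total in rows:
        if status == "open" and total > 100:
            conn.execute(
                "UPDATE orders SET status = 'review' WHERE id = ?", (order_id,)
            )


def flag_set_based(conn):
    """Set-based style: one statement describes all rows that should change."""
    conn.execute(
        "UPDATE orders SET status = 'review' WHERE status = 'open' AND total > 100"
    )


def statuses(conn):
    return [s for (s,) in conn.execute("SELECT status FROM orders ORDER BY id")]


procedural = make_db()
flag_row_by_row(procedural)

set_based = make_db()
flag_set_based(set_based)
```

With four rows the difference is invisible, but the row-by-row version issues one statement per matching row, while the set-based version lets the database plan the whole change at once.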

The key here is not to come into these situations with guns blazing trying to tell the developers how many things they have done wrong to the database. You need to walk into situations like this and focus on what short-term items you can do for the database to give it some time so that you can educate and start discussions with your application developers. 

Does it end in a truce? 

For me, a “truce” means it’s a competition and someone has to win. I don’t see any need for competition. No one has to “win”. They have to create a system that works for the company in a way that works for the company. If the Database is hit a little harder than I would like but it makes better financial and cultural sense for the company then both the application developers and DBA/Data people are winning. That’s what you really want. You don’t want a Truce, you want a working system that makes the company money. 

Thank you, Hettie, for bringing up a great topic! Hopefully, all my readers can think about how they approach working with application developers in the future. Perhaps like in the picture at the beginning they will sit down and have a drink and a good discussion about Databases.