Among people who think Google is out to get them, there is a popular idea of a duplicate content penalty. As a result, there is a level of unjustified paranoia around ensuring that all content is unique, and efforts are even made to spin duplicate content into something else. There is even an obsession with blocking pages from Google’s crawlers if their content is not unique.
In fact, duplicate content is an issue, but there is no actual penalty applied to anything deemed to be duplicate. From a user perspective, Google wants to make sure that each result in the search results is distinct from the others, so a user doesn’t see a results page with 7-10 listings of the exact same content. This could be content from the same site or even across different sites.
Therefore, when Google identifies duplicate content, they have an algorithm which determines the canonical version of that content. In their analysis they take into account any canonical listings in the source code, but there is no guarantee that they will agree.
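Those canonical signals in the source code are typically expressed as a link tag (or an equivalent HTTP header) on the duplicate page pointing at the preferred version. A minimal sketch, with hypothetical URLs:

```html
<!-- Placed in the <head> of the duplicate page; the href is the preferred (canonical) URL -->
<link rel="canonical" href="https://www.example.com/preferred-page/" />
```

This is a hint rather than a directive; as noted above, Google weighs it alongside its own analysis and may select a different canonical.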
As they determine the canonical version of content, they look at authority, user experience and what algorithmically seems like the best overall fit.
Provided that the content is not a doorway page intended to trick Google into ranking a page undeserving of being ranked (city/state pages, for example), duplicate content is not harmful.
Having duplicate content on a site is not an issue that could hurt a website, and it does not need to be avoided. Duplicate content can come in many forms, and in many cases it can be very valuable for users. For example, product descriptions are usually sourced from manufacturers and are duplicated across all websites that sell that product. There is no reason to avoid hosting this content or to go through the extra effort of changing a few words so it is unique.
As another example, wire news services like the Associated Press or Reuters have their news syndicated across many media sites. If a website such as CNN.com or the NY Times did not include this content, it would be doing its users a disservice.
When it comes to how Google ranks the duplicate content in both of these examples, it will choose the website that best matches the user’s query and allow that version to rank for the query. Depending on the query, one user may see a product page on Amazon while another sees Walmart.com in the first position. Query modifiers like “near me”, “reviews” or “free shipping” could be the determinants that drive that visibility.
In short, duplicate content does not need to be avoided if it fits the overall purpose of a website and was created for users. As with everything related to SEO, the overarching principle should be whether something is good for users; if it meets that bar, it is perfectly safe to use.