Mastering the Art of Customizing NetworkX Graphs
Chapter 1: Introduction to NetworkX Customization
Graph theory serves as a powerful tool in data science, enabling the visualization and comprehension of intricate relationships. As part of an open-source initiative, I compiled resources to construct a graph illustrating connections among professional theatre lighting designers in New York City.
To create this graph, I utilized NetworkX, a Python library designed for graph construction. While it comes with practical defaults, incorporating matplotlib allows for extensive customization of the graph's various features. After spending considerable time navigating through documentation and community forums, I decided to compile my findings into a comprehensive guide, ensuring you can also produce clear graphs that effectively convey complex connections.
Section 1.1: Building Your First NetworkX Graph
Let's begin by creating a simple graph! There are multiple methods to achieve this. I discovered that constructing the graph from a pandas DataFrame, which specifies the edges, is the most straightforward approach. So, what exactly is an edge? Graphs consist of nodes and edges; a node represents an entity, such as a person or organization, while an edge signifies the connection between two nodes. In the example below, the nodes are represented as "A," "B," "C," and "D," with the lines between them serving as edges.
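As a minimal sketch of this approach, the snippet below builds a four-node graph from a DataFrame of edges. The edge list and the column names `from` and `to` are made up for illustration.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list: each row describes one edge between two nodes.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})

# Build an undirected graph; nodes are created automatically from the edges.
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Draw with the default settings, adding the node names as labels.
fig, ax = plt.subplots()
nx.draw(G, ax=ax, with_labels=True)
# In a script, call plt.show() to display the figure.
```

Because the graph is built from edges alone, a node that has no connections would need to be added separately with `G.add_node()`.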
Section 1.2: Customizing Node Colors
Changing the color of all nodes is quite simple: pass the node_color keyword to nx.draw(). The same pattern holds for most features that apply to the entire graph, each of which has its own keyword argument.
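A short sketch of a global color change, assuming the same made-up edge list as a starting point. The color name `skyblue` is an arbitrary choice; any matplotlib color should work.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list used only for illustration.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

fig, ax = plt.subplots()
# A single color value is applied to every node in the graph.
nx.draw(G, ax=ax, with_labels=True, node_color="skyblue")
# In a script, call plt.show() to display the figure.
```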
Subsection 1.2.1: Node Colors by Type
To customize node colors based on type rather than globally, some additional setup is required. Create a second DataFrame that lists each node ID with its type, reorder it to match G.nodes(), and convert the type column with pd.Categorical() so that each type gets an integer code. Passing those codes as node_color together with a colormap colors the nodes by type. As a result, our letter nodes are now displayed in blue, while number nodes appear in orange!
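The setup described above might look like the sketch below. The edge list is invented, and the DataFrame name `carac` with its `ID` and `type` columns is an assumption based on the names used later in this guide. The number nodes are stored as strings here to keep the index a single type.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd
from matplotlib.colors import ListedColormap

# Hypothetical edges linking letter nodes to number nodes.
df = pd.DataFrame({"from": ["A", "B", "C", "D"], "to": ["1", "1", "2", "2"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# One row per node, giving its type.
carac = pd.DataFrame({
    "ID": ["A", "B", "C", "D", "1", "2"],
    "type": ["Letter"] * 4 + ["Number"] * 2,
})

# Reorder the rows so they match the node order NetworkX uses when drawing.
carac = carac.set_index("ID").reindex(list(G.nodes()))

# Categorical codes turn each type into an integer: Letter -> 0, Number -> 1.
carac["type"] = pd.Categorical(carac["type"])
codes = carac["type"].cat.codes.tolist()

# Two-color map: code 0 is drawn in blue, code 1 in orange.
cmap = ListedColormap(["tab:blue", "darkorange"])

fig, ax = plt.subplots()
nx.draw(G, ax=ax, with_labels=True, node_color=codes, cmap=cmap, node_size=1500)
# In a script, call plt.show() to display the figure.
```

The reindex step matters: without it, colors are assigned in DataFrame order rather than graph order, and nodes can end up with the wrong color.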
Section 1.3: Adjusting Node Sizes
You can easily modify the global node size with the node_size keyword argument of nx.draw(); the default is 300.
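For instance, a sketch with a made-up edge list and an arbitrary size of 1500:

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list used only for illustration.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

fig, ax = plt.subplots()
# A single number sets the size of every node.
nx.draw(G, ax=ax, with_labels=True, node_size=1500)
# In a script, call plt.show() to display the figure.
```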
Subsection 1.3.1: Node Size by Type
Similar to color, node size can also be adjusted by type. This feature is particularly useful when linking individuals to organizations, as organizations tend to have multiple associated individuals, making them logical hubs with people as spokes. We will build upon our previous example, but instead of a singular node_size argument, we'll pass a list of sizes that correspond to each node's type.
Here's the logic behind the list comprehension for those needing guidance:
node_sizes = [4000 if entry != 'Letter' else 1000 for entry in carac.type]
This code sets node sizes to 4000 for non-letter types and 1000 for letter types. While this may seem excessive with only two types currently, it will prove beneficial as the project expands.
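To see the list comprehension in context, here is a sketch with an invented edge list and a `carac` DataFrame reconstructed from the description above. The sizes come from the line shown earlier.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edges linking letter nodes to number nodes.
df = pd.DataFrame({"from": ["A", "B", "C", "D"], "to": ["1", "1", "2", "2"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Node types, reordered to match the node order in the graph.
carac = pd.DataFrame({
    "ID": ["A", "B", "C", "D", "1", "2"],
    "type": ["Letter"] * 4 + ["Number"] * 2,
})
carac = carac.set_index("ID").reindex(list(G.nodes()))

# One size per node: 1000 for letters, 4000 for everything else.
node_sizes = [4000 if entry != "Letter" else 1000 for entry in carac["type"]]

fig, ax = plt.subplots()
nx.draw(G, ax=ax, with_labels=True, node_size=node_sizes)
# In a script, call plt.show() to display the figure.
```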
Subsection 1.3.2: Manual Node Sizes
If you wish to emphasize specific nodes without changing their size based on type, you can manually specify a list of sizes. Ensure these sizes align with the order of nodes stored in G.nodes(). A size of 5000 often works well for fitting names comfortably.
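One way to keep the sizes aligned with G.nodes() is to store the exceptions in a dictionary and build the list from the graph itself. This is a sketch, not the only option; the `highlight` dictionary and the default size of 1000 are assumptions made for the example.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list used only for illustration.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Nodes to emphasize; every other node gets the default size.
highlight = {"A": 5000}
default_size = 1000

# Iterating over G.nodes() guarantees the list is in drawing order.
node_sizes = [highlight.get(node, default_size) for node in G.nodes()]

fig, ax = plt.subplots()
nx.draw(G, ax=ax, with_labels=True, node_size=node_sizes)
# In a script, call plt.show() to display the figure.
```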
Section 1.4: Customizing Edge Attributes
Having addressed node attributes, let's turn our attention to edges. Modifying the global width or color of edges is as straightforward as it is for nodes: use the width and edge_color keywords in nx.draw().
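A brief sketch with a made-up edge list; the width and color values are arbitrary.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list used only for illustration.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

fig, ax = plt.subplots()
# Single values for width and edge_color apply to every edge.
nx.draw(G, ax=ax, with_labels=True, width=3, edge_color="gray")
# In a script, call plt.show() to display the figure.
```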
Subsection 1.4.1: Individual Edge Attributes
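For more detailed customization, edges can be sized or colored individually by providing lists of attributes. Below, we've defined edge_colors and edge_widths that will be cycled through. Supplying one entry per edge, in the order returned by G.edges(), keeps the mapping between edges and attributes explicit.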
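The sketch below supplies one color and one width per edge, in the order given by G.edges(). The edge list and the specific colors and widths are invented for illustration.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list used only for illustration.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# One entry per edge, matched to the order of G.edges().
edge_colors = ["red", "green", "blue", "gray"]
edge_widths = [1, 2, 3, 4]

fig, ax = plt.subplots()
nx.draw(G, ax=ax, with_labels=True, edge_color=edge_colors, width=edge_widths)
# In a script, call plt.show() to display the figure.
```

Printing `list(G.edges())` before choosing the values is an easy way to check which edge each entry will apply to.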
Subsection 1.4.2: Node Border Colors
Additionally, nodes can feature colored borders using the keyword "edgecolors," which differs from "edge_color." This feature helps to clarify and separate nodes visually.
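A sketch of node borders, assuming a made-up edge list. The `linewidths` keyword, which controls the border thickness, is included here as an extra and the values are arbitrary.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list used only for illustration.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

fig, ax = plt.subplots()
nx.draw(
    G,
    ax=ax,
    with_labels=True,
    node_color="white",   # fill color of each node
    edgecolors="black",   # border color of each node (not the graph edges)
    linewidths=2,         # thickness of the node border
    edge_color="gray",    # color of the lines connecting nodes
    node_size=1500,
)
# In a script, call plt.show() to display the figure.
```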
Section 1.5: Graph Layout Options
The layout of a graph is crucial, significantly impacting its readability and effectiveness. NetworkX offers various layout options, and I will cover the four most popular ones below. The default layout is spring_layout, but other options may be more suitable depending on your specific needs. I encourage experimenting with different layouts to determine which one best suits your data.
The NetworkX drawing documentation describes all of the available layouts.
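As a rough comparison, the sketch below draws the same graph with four layout functions side by side. The choice of spring, circular, random, and shell is an assumption made for this example, as are the edge list and the seed values.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Hypothetical edge list used only for illustration.
df = pd.DataFrame({"from": ["A", "B", "C", "A"], "to": ["D", "A", "D", "C"]})
G = nx.from_pandas_edgelist(df, source="from", target="to")

# Each layout function returns a dict mapping node -> (x, y) position.
layouts = {
    "spring": nx.spring_layout(G, seed=42),   # seed makes the result repeatable
    "circular": nx.circular_layout(G),
    "random": nx.random_layout(G, seed=42),
    "shell": nx.shell_layout(G),
}

# Draw one panel per layout, passing the positions through the pos keyword.
fig, axes = plt.subplots(2, 2, figsize=(8, 8))
for ax, (name, pos) in zip(axes.flat, layouts.items()):
    nx.draw(G, pos=pos, ax=ax, with_labels=True)
    ax.set_title(name)
# In a script, call plt.show() to display the figure.
```

Computing the positions once and reusing the same `pos` dictionary also keeps nodes in place when the graph is redrawn with different styling.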
Chapter 2: A Comprehensive Example
To illustrate these concepts, here’s a complete example from my project. I developed a relationship map highlighting prominent lighting designers alongside notable universities and organizations in the theatre design field. The objective was to analyze how personal connections influence the close-knit community of theatre designers.
You'll notice that node text can also be modified. Keywords like font_size and font_weight allow for customization, and newline characters ("\n") enhance readability in node titles. For example, the node for John Gleason appears as "John\nGleason" in the DataFrame.
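A small sketch of these label options follows. Apart from the John Gleason label mentioned above, the node names and connections are placeholders and not data from the project.

```python
import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# Placeholder edges; "\n" inside a name splits the label across two lines.
df = pd.DataFrame({
    "from": ["John\nGleason", "John\nGleason"],
    "to": ["Example\nUniversity", "Example\nOrganization"],
})
G = nx.from_pandas_edgelist(df, source="from", target="to")

fig, ax = plt.subplots(figsize=(6, 6))
nx.draw(
    G,
    ax=ax,
    with_labels=True,
    node_size=5000,       # large enough for a two-line name
    node_color="skyblue",
    font_size=10,
    font_weight="bold",
)
# In a script, call plt.show() to display the figure.
```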
Chapter 3: Conclusion
I hope this guide equips you with practical examples of how to customize various aspects of NetworkX graphs to enhance their readability. NetworkX is an incredibly versatile package, and while its defaults are effective, adjustments to node size, color, edge width, and layout can draw attention to critical information as your projects grow.
Connect with Me
I'm always eager to connect and explore new projects! Feel free to follow me on GitHub or LinkedIn, and check out my other articles on Medium. Additionally, you can find me on Twitter!